Sunday, November 29, 2020

HelloTwo

Overview

Kickstarter promo image, ©Hello
In 2014 a startup called Hello launched their first product, the "Sense", on Kickstarter. This little orb of intricate injection-moulded plastic was easily the most fully featured sleep-tracking device available to consumers. The business model was based solely on hardware sales with no ongoing service fees, which made the business very unsustainable. After attracting quite a lot of success and breaking into mainstream retail sales, Hello folded in 2017.

Unfortunately, this left the hardware unsupported with no servers to process its data. Hello initially floated the idea of open-sourcing support for their devices, but this never got off the ground. Fast forward to 2020 and it has become obvious no one is going to get the original hardware working again. It simply wouldn't make sense, in the days of $4 WiFi-enabled MCUs, to put the effort into reverse engineering the software stack on the Sense. What does make sense is scavenging some of the amazing industrial design and sensors and building a new data logging solution.

The original Sense offers six environmental sensors: illumination, humidity, temperature (HTU21D-F), sound, dust (Sharp GP2Y10), and motion (via a Bluetooth dongle). I'm only interested in air quality, so my work only involves those sensors.

Hardware

Before tearing my device to pieces, I had a look around the internet and stumbled onto Lyndsay Williams' Blog. Lyndsay did a great teardown and inspection of the Sense back in 2015, and the images were a great help to me.

Power board pictured later
To summarize, the Sense is built with four PCBs. From the bottom up: power conditioning, LEDs, processor, and finally the sensor board.

Between the LED board and the processor sits the Sharp GP2Y10 dust sensor.

To avoid self-heating, the HTU21D-F is mounted at the bottom of the device near the input air vents. The intent seems to be that any heat generated in the device will create an updraft, keeping the temperature and humidity values reasonably true. Reviewing the layout of the sensors shows that for this project only the power conditioning board is needed; all the others can be removed. This leaves plenty of space to fit a NodeMCU board inside the Sense.

In order to power the replacement MCU and talk to the HTU21D-F, several connections need to be made to the power conditioning board. Originally, these were made with a flex PCB ribbon cable. Not having access to an appropriate header, the obvious solution was to take inspiration from the insanity that is the Wii modding community and break out the magnet wire.

Using a multimeter in continuity mode, I probed around the power board and found pads which carried the I2C signals from the HTU21D-F and the USB power signals. The same process could be carried out to interface with other sensors in the device, but as previously mentioned I was only interested in air quality. After soldering magnet wire to each of the pads, I applied a couple of dabs of enamel nail polish to relieve the wires and avoid having to do any later rework.


Wiring to the Sharp GP2Y10 was somewhat simpler. Again the connection is made with a ribbon cable; however, in the case of the Sharp sensor it is a regular wire cable. The MCU end of this cable can be snipped off and soldered to. The Sharp GP2Y10 is a really frustrating device. It's apparently calibrated at the factory but relies on two external components (a resistor and a capacitor) to control the sensing and readout. This means that the sensor can never be more accurate than the tolerance of those external components allows. It would have been extremely easy for Sharp to have integrated them and calibrated the system as a whole. With the magnet wire mess already well underway from interfacing the temperature sensor, I decided to simply float the support components for the Sharp sensor.


With all the sensors wired up, I powered the board from the USB port on the NodeMCU and tested out my interface code. It turned out that several of the digital pins broken out on the NodeMCU have odd reservations, and the trigger pin for the Sharp sensor had to be moved.

After this was complete the sensors worked as expected. The only issue was the really intense LED on the NodeMCU. 

The Sense was designed to include a ring of LEDs and as such, the whole thing lights up blue. That wasn't desirable so the LED was quickly desoldered. After carefully reassembling the device, the hardware work was complete. (I did not bring the USB data lines from the power board up to the NodeMCU so the device must be opened to update the firmware via USB).


Software

For collecting and monitoring the data I looked into a lot of solutions. For this project I wanted long-term data storage, low cost, and MQTT as the data transfer mechanism. Adafruit and Particle offer hosted services which meet two of these conditions, but fail to provide the full package. Turning to DIY solutions, ELK (Elasticsearch, Logstash, and Kibana) was my first thought. However, while searching around on GitHub I found a similar datalogging project by Nilhcem. He uses MQTT, InfluxDB, and Grafana. This is a much better stack, and thanks to Nilhcem you can stand up his logging solution from a Docker Compose file in about 5 minutes. A couple of simple Grafana graph configurations later, and you should have a dashboard like this.
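For a rough idea of what such a stack involves, here is a minimal Compose sketch of an MQTT + InfluxDB + Grafana pipeline. This is my own illustrative guess, not Nilhcem's actual file; the image tags, ports, and service names are assumptions.

```yaml
# Illustrative sketch only -- not Nilhcem's actual compose file.
version: "3"
services:
  mosquitto:          # MQTT broker the NodeMCU publishes to
    image: eclipse-mosquitto
    ports:
      - "1883:1883"
  influxdb:           # time-series storage for the sensor readings
    image: influxdb:1.8
    ports:
      - "8086:8086"
  grafana:            # dashboards over the InfluxDB data
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - influxdb
```

A bridge process (or Telegraf's MQTT consumer plugin) still has to subscribe to the broker and write the readings into InfluxDB; Nilhcem's repo includes that glue.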

If you have a Hello Sense sitting around and want to replicate what I built, you can pull the Arduino code from my GitHub here: https://github.com/Bostwickenator/HelloTwo.


Troubleshooting

  • Pin reservations on the NodeMCU: GPIO 9 and 10 are special and you need to be careful using them. http://smarpl.com/content/esp8266-esp-201-module-freeing-gpio9-and-gpio10
  • Timing for the Sharp GP2Y10 is very dependent on the capacitor value. I suggest graphing the sensor response and picking the highest point of the curve for your individual circuit instead of relying on the 280 microseconds specified in the datasheet.
  • Docker for Windows does not start on boot; solution on StackOverflow.
  • Accidentally using 3 instead of D3 for a pin operation on the NodeMCU will compile and will hard reset your device each time it is executed.
  • Self-heating: the NodeMCU produces a surprising amount of heat. With the device fully powered up, the reported temperature was 3 degrees centigrade higher than the true value. To combat this I put the WiFi module into forced power-down, which reduced the temperature offset to 1 degree, and I compensated for that in software.
  • The Sharp GP2Y10 response values are incredibly noisy. The values bounce around with a standard deviation of at least 3 bits in the 10-bit ADC on the ESP8266. To get around this, I had to collect a lot of samples and apply a Gaussian filter.
  • The Sparkfun library for the HTU21D-F crashes the ESP8266. The Adafruit library works perfectly.
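The oversample-and-filter approach from the dust-sensor bullet can be sketched in a few lines. This is an illustration in Python rather than the Arduino firmware, and the sample count and sigma are my own choices, not the values I shipped:

```python
import numpy as np

def gaussian_smooth(samples, sigma=5.0):
    """Smooth noisy ADC readings with a 1-D Gaussian kernel."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    samples = np.asarray(samples, dtype=float)
    smoothed = np.convolve(samples, kernel, mode="same")
    # Renormalise the edges where the kernel hangs off the signal
    weight = np.convolve(np.ones_like(samples), kernel, mode="same")
    return smoothed / weight

# Simulated dust sensor: true value 512 counts with heavy noise
rng = np.random.default_rng(0)
noisy = 512 + rng.normal(0, 8, size=200)
smoothed = gaussian_smooth(noisy)
print(noisy.std(), smoothed.std())  # the filtered trace is far less noisy
```

Collecting many readings per reported value and filtering like this trades responsiveness for stability, which is fine for a slow-moving quantity like dust density.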

Saturday, September 19, 2020

Zoom like it's 1988 with the Mitsubishi VisiTel

Well, here we are in 2020: remote work is finally here and, after decades of marketing by every major telecommunications company, my mum knows how to make a video call. Unfortunately this sudden enthusiasm caused unprecedented shortages of webcams. We can't let this slow us down as there are important calls to be made, and I have the solution. Introducing the Mitsubishi VisiTel!


This definitely was The Future™, and for the low, low price of $399 how could this not catch on immediately?! Popular Mechanics' Feb 1988 issue carried a full-page description with a surprising amount of technical detail. Meanwhile, the hosts of Gadget Guru on WSMV asked the most profound question about video calling within seconds of being introduced to it: "Can I still use the phone without this?"

Well, it has taken almost 30 years and a global pandemic, but we have finally accepted the inevitable fact that we are going to have to take a shower before talking to our colleagues on the phone. Video telephony is here in force.

I am not the first person to think there is something especially kitsch about the VisiTel. People have been tinkering with these for a while, and despite not having motion video, it is one of those products which was clearly a feat of engineering and massively ahead of its time.

Japhy Riddle's video is what got me really interested in the VisiTel. He pointed out that by slowing down or speeding up the recording of an image sent to the VisiTel, the image skewed on the screen. Immediately I knew this meant the device used a simple AM modulation scheme. With that knowledge in hand, I jumped on eBay and picked up a unit to play with. My goal: to make this glorious time capsule of late-80s tech work with 2020's work-from-home platform of choice, Zoom, so I can chat with my colleagues. Moreover, I want to do this without modifying the hardware, because these things are just too cool to butcher.

Step 1: Electrical interfacing

The VisiTel is extremely easy to set up. On the back of the device is one long cable with a Y joint near the end (more on this in a minute) and two plugs. One plug is a 2.1mm DC barrel jack with which the device is powered; it expects 15 volts, center positive. Oddly, they chose to write the number 15 on the device in a font that makes it appear to read "IS Voltage". The second plug is an RJ11 connector with two of the pins populated. This is the standard layout for a telephone handset. If you remember corded phones, you might also remember that the receiver was connected to the dialy bit (technical term) by means of an RJ9 connector, and the dialer to the wall with an RJ11 connector. The intent being that you could plug other accessories into your dialer, or replace the receiver or cable if something went wrong. Not that the latter problem would happen often, because telephone handsets used a special kind of wire which is extra flexy.


Returning to the back of the VisiTel, along with the cable there is also an RJ11 socket. After scratching my head for a while, the obvious dawned on me. The VisiTel is designed to man-in-the-middle your telephone so that it can listen or transmit whilst you make calls. That leads to an obvious problem: as the VisiTel unit is sharing the wiring in parallel with the telephone, the user is going to hear images being received and transmitted. Well, the engineers at Mitsubishi thought of that, and whenever you receive or transmit an image there is a loud click from the device as a relay disconnects the handset. For our application we aren't interested in listening to the audio signal, so we aren't going to plug anything into that RJ11 socket on the back of the device. This makes the isolation relay superfluous, meaning it could be removed for clickless image sending.

So how do we communicate with this device? My computer for one doesn't have an RJ11 socket on it. No worries, apparently there is a market for adaptors to connect your RJ9 telephone handsets to smartphones. Or you can simply snip off the RJ11 connector and solder on a mono 3.5mm jack. Note: I originally thought the VisiTel had an RJ9 connector not an RJ11.

With your 3.5mm jack in place connecting couldn't be simpler. Get a USB sound adaptor with separate microphone and headphone output jacks (not TRRS) and plug the VisiTel into the appropriate socket.

I said that I'd get back to the Y joint on that cable. On my device the wires in that joint had failed, not a nice solid "it's broken" failure, but one of those nasty types of failure where everything works when you go to bed and the next morning it doesn't. If only the VisiTel team had used some of that special extra flexible wire. This intermittent failure cost me days of staring blankly at screens, fruitlessly looking for thermal issues, and triple checking my sound card settings. Eventually I discovered the failure when in frustration I picked the device up during a test and saw a crackle of static on my recording. If there is a lesson about old hardware to be learned here it might be to perform a "jostle" test once you have everything working and see if anything suddenly doesn't. 


Step 2: Understand the protocol

Dodgy wires aside, now that we have the hardware interfaced with the computer it's time to dig into what the protocol looks like. As we can tell from the videos and articles about the VisiTel, the image is encoded using AM modulation on an audio-frequency carrier. AM stands for Amplitude Modulation and describes the process of encoding information in the amplitude (volume) of a waveform. Normally it is used to encode audio onto radio waves; however, there is no reason you can't encode images onto radio waves, or even images onto audio waves. @nabanita.sarkar has a great description of this process with more detail and some Python example code for people who learn by doing.
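To make the idea concrete, here is a toy AM modulator: each pixel's brightness sets the amplitude of one full cycle of an audio carrier. This is my own sketch of the general technique, not the VisiTel's actual encoder, and the 1750Hz carrier value is an assumption for illustration:

```python
import numpy as np

SAMPLE_RATE = 44100
CARRIER_HZ = 1750                       # illustrative carrier frequency
SAMPLES_PER_WAVE = SAMPLE_RATE / CARRIER_HZ

def am_encode(pixels):
    """Encode a row of pixel brightnesses (0.0-1.0) as AM audio.

    Each pixel becomes one full cycle of the carrier whose amplitude
    is the pixel value -- a toy version of the scheme described above.
    """
    n = int(round(len(pixels) * SAMPLES_PER_WAVE))
    t = np.arange(n)
    carrier = np.sin(2 * np.pi * CARRIER_HZ * t / SAMPLE_RATE)
    # Work out which pixel each audio sample belongs to
    idx = np.minimum((t / SAMPLES_PER_WAVE).astype(int), len(pixels) - 1)
    return np.asarray(pixels)[idx] * carrier

row = [0.0, 1.0, 0.0, 1.0]   # black / white test pattern
audio = am_encode(row)       # bursts of carrier where the pixels are bright
```

Playing `audio` out of a sound card would give alternating silence and full-volume carrier, which is essentially what the blackWhiteV trace below shows (minus the inversion discussed later).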

Knowing the data was AM encoded in the audio domain, the first step was to record the transmission of some sample images with known properties. To capture these I used my go-to application for audio processing, Audacity.


As you can see above, I wanted to validate my understanding of how the data was encoded, so I sent some very simple sample images from the VisiTel. These were made by holding colored pieces of card in front of the camera, one piece black and one piece white.

The input image for the trace labeled blackWhiteV looked roughly like this.


Let us examine the result a little closer.



Warning: loud 1750Hz tone

As you can see, there appears to be some kind of initialization / header / preamble at the beginning of each image @17.55 to @17.80 in this example. This allows the receiving device to detect that an image is about to be sent and determine the maximum amplitude of the signal. This maximum amplitude is then used as a scale factor during image decoding so that a lossy quiet phone line doesn't make the images being transferred lose contrast or brightness. Note: the signal you can see before @17.55 is simply noise on the line and isn't important for the operation of the VisiTel protocol. 

To check that this header signals an image is about to be sent, I played just this small section of audio to the VisiTel and was rewarded with the sound of the relay clicking on. Clearly this triggers something. However, when sending just the 30 milliseconds of header, the VisiTel seems to detect that there is no image data being sent and turns the relay back off a few milliseconds after the header ends. Playing the header and the first few lines of the image causes the VisiTel to start drawing an image on the screen. If you stop the audio while an image is being drawn, the VisiTel continues receiving until the image buffer is full. This once again shows that once the VisiTel is reading pixels it doesn't rely on an external oscillator to know when to sample data; it has an internal clock source telling it when to inspect the waveform.

As the header simply establishes that a connection is being made and doesn't need to change along with the image, I didn't feel the need to delve much deeper into how it works. Knowing that the header establishes a connection and roughly what it looks like is enough for our purposes.

Moving on to the most difficult part: the pixel format. The first step was to figure out how the pixel data is modulated onto the carrier wave. My first hunch was that each complete wave represents one pixel. I checked this by counting the number of waves between repeats of my test pattern. Indeed, it matches what the old advertisements say: 96x96 pixels, with a few lines drawn before the visible image starts.

With this we know the amplitude of each wave is sampled and drawn into one pixel of the digital image buffer contained in the VisiTel. We know from the marketing material that pixels have 16 shades of grey; however, as I am evaluating pixels from an 'analog' waveform, I felt there was no need to posterize them in my decoding or encoding.

Interestingly, the pixel brightnesses are inverted before modulation, so the largest waves represent the darkest pixels; the lines are also mirrored left to right, giving a mirror image. I'd love to hear any reader theories on why the brightness inversion may have been done. My suspicion is that the human eye is more forgiving of a random black pixel than a random white pixel, and line noise is something which the VisiTel surely had to contend with a lot back in 1988.


There is, however, an exception to this encoding scheme which eluded me for weeks. In this inverted encoding scheme, completely white pixels should be represented by quiescence; no carrier wave should appear on the output at all. The VisiTel designers seem to have not liked this idea. Instead, to encode a completely white pixel the carrier wave is offset by a quarter of a wavelength, putting it out of phase with the wave in the "default" state. This way the carrier wave can still be sent down the line. At the receiving end, the device is still synchronized with the "default" waveform and will sample the amplitude right as the shifted wave passes through the zero point, correctly giving a white pixel. As I previously mentioned, if the audio is cut off halfway through sending an image the VisiTel continues drawing white pixels until the frame buffer is full; so clearly silence can be read as white without the presence of the carrier wave. I have no idea why the designers decided to complicate their modulation scheme with this out-of-phase mode. It seems like a huge amount of effort to go to for little to no gain.
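A quick numeric check shows why the quarter-wave offset reads as white: sampling a shifted carrier at the instants where the "default" carrier peaks gives values near zero. The numbers below are my own illustration (using the samples-per-wave figure measured from my unit), not readings from the device:

```python
import numpy as np

SAMPLES_PER_WAVE = 25.23158   # measured for my unit at 44.1 kHz

# Sampling instants locked to the peaks of the "default" carrier
t = np.arange(10) * SAMPLES_PER_WAVE + SAMPLES_PER_WAVE / 4

default = np.sin(2 * np.pi * t / SAMPLES_PER_WAVE)            # in-phase wave
shifted = np.sin(2 * np.pi * t / SAMPLES_PER_WAVE + np.pi/2)  # 1/4-wave offset

print(default.round(3))  # ~1.0 everywhere: full-amplitude (dark) pixels
print(shifted.round(3))  # ~0.0 everywhere: reads as white
```

So a full-amplitude but quarter-wave-shifted carrier and genuine silence both decode to white at the synchronized sampling instants, which matches the behaviour observed when the audio is cut off mid-image.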

Not having known about this phase-shift modulation, my initial approach to decoding the images involved looking for the maxima of each wave and drawing that as a pixel. Line steps were emitted after 96 * the number of samples per wave. This led to some lines having slightly too many or too few pixels. Additionally, this crude approach was very sensitive to noise, as small spikes in the wave could cause additional pixels to be output. It was, however, quite simple to implement, and this simple implementation was able to decode images without having to synchronize clocks. The results unfortunately just weren't usable.

In order to decode images more accurately, we have to do what the VisiTel does: synchronize with the signal during the header and then continue sampling at regular intervals after that. As you may expect, this requires extremely precise timing. For my unit at least, when recording at 44100Hz there are 25.23158 samples per wave (meaning the carrier wave is at 1747.80968929Hz). So each time we read a pixel, we set the index of the next target 25.23158 samples further into the audio buffer. As we only have audio samples at integer offsets, we simply round to the nearest sample and take that one. The key point is we don't allow this rounding error to accumulate, as the sampling position would quickly get out of phase with the wave. Getting just a little out of phase results in some intense artifacts, as shown below.


Thankfully, this samples-per-wave value appears to be very stable. It does not change as the unit warms up, which I had been concerned would present an issue. This invariance allowed me to hardcode a value into the decoding logic. Ideally the samples-per-wave value would be derived from the header data, but I found there are not enough samples to reach five decimal places of precision with my implementation, and the hardcoded value works robustly. With this more accurate implementation, line steps are simply emitted after the draw-pixel function has been called 96 times.
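The index bookkeeping described above can be sketched like this (a reimplementation of the idea in Python, not the exact code from the repo): carry the fractional position forward as a float and round only when reading a sample, so the rounding error never accumulates.

```python
SAMPLES_PER_WAVE = 25.23158   # measured for my unit at 44.1 kHz

def pixel_sample_indices(start, n_pixels, step=SAMPLES_PER_WAVE):
    """Return the audio-buffer index to sample for each pixel.

    The exact (fractional) position is carried forward between reads;
    rounding happens per read, so drift never builds up across a line.
    """
    indices = []
    position = float(start)
    for _ in range(n_pixels):
        indices.append(int(round(position)))
        position += step
    return indices

idx = pixel_sample_indices(0, 96)   # one 96-pixel line
```

Had we instead added a rounded integer step each time, the sampling point would slip by roughly 0.23 samples per pixel and be a full wave out of phase within a couple of lines.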

Up until now I have been working with pre-recorded blocks of audio decoded from WAV files. In order to operate live and decode images as they are sent to an audio interface, the decoder needs to be able to detect the header and calculate where the image data starts. Looking at the header, it has three distinct phases which are easy to detect.


Carrier wave --> "Silence" --> Carrier wave

In order to find these I implemented a simple FFT-based detector and a finite state machine. First the waveform is transformed into the frequency domain. The detector then inspects each block of audio until it finds the carrier: a strong signal at 1747Hz. Blocks are then inspected until the signal disappears and returns again. At this point the start of a transmission has been found. From here a simple static offset is used to find the beginning of the image data, and the image decoding routines already discussed are utilized. Once an image has been decoded it is displayed, and the finite state machine is reset in order to listen for the next transmission.
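A minimal version of that carrier detector might look like the following. The block size and threshold are my own choices for illustration; the real implementation lives in the repo:

```python
import numpy as np

SAMPLE_RATE = 44100
CARRIER_HZ = 1747.8
BLOCK = 1024

def carrier_present(block, threshold=0.1):
    """Return True if the audio block contains a strong ~1748Hz component."""
    spectrum = np.abs(np.fft.rfft(block * np.hanning(len(block))))
    freqs = np.fft.rfftfreq(len(block), d=1.0 / SAMPLE_RATE)
    carrier_bin = np.argmin(np.abs(freqs - CARRIER_HZ))
    return bool(spectrum[carrier_bin] > threshold * len(block))

# Synthetic check: carrier -> silence -> carrier, as in the header
t = np.arange(BLOCK) / SAMPLE_RATE
tone = np.sin(2 * np.pi * CARRIER_HZ * t)
silence = np.zeros(BLOCK)
print([carrier_present(b) for b in (tone, silence, tone)])
```

The state machine then just walks through carrier-seen, carrier-gone, carrier-back states on successive blocks and hands the stream off to the image decoder once the last transition fires.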

Step 3: Present the device as a webcam

Now that we have managed to decode the images, there is one final piece to this puzzle: presenting the decoded images as video frames to our video conferencing software. On Linux this is surprisingly easy. Video input is abstracted using the V4L2 interface; unfortunately, this still happens in kernel space. To avoid the complications of writing a kernel module you can use V4L2 Loopback. This module presents itself as both an input and an output device: whatever pixels you publish to it as output, it makes available as input for other programs like Zoom. There are even a few Python packages which further abstract this to play nicely with OpenCV images and NumPy. Of those packages, I have used pyfakewebcam for this project. The interface couldn't be simpler. To set up the virtual webcam you do the following:

import pyfakewebcam

self.camera = pyfakewebcam.FakeWebcam(self.v4l2_device, 640, 480) 

and whenever a new frame is available:

self.camera.schedule_frame(output)

With that simple addition to the decoding program: success. Zoom can now receive images from the VisiTel and we can video chat like it's 1988.

Going forward, I'd love to write a Linux Direct Rendering Manager driver for this so that I can output images onto the VisiTel screen as well as receiving them. For now, having the video capture set up and working in Zoom is enough for me to call this project a success.

I hope you enjoyed that write-up. Check out the code on GitHub, and please let me know if you set up your own VisiTel webcam: https://github.com/Bostwickenator/VisiTel