Ironically, mobile phones being digital makes them easier for many people to understand. The analog circuitry that goes into simply making an analog phone ring is surprisingly complex, let alone how they actually function as phones.
Analog audio is a lot less "computer nerds programming things" and a lot more "scrapped together from some resistors that were ripped out of an old TV. We don't even know how it turns on, let alone how it functions." You can literally build a basic microphone with nothing but a balloon stretched across an embroidery hoop, some copper wire, a small magnet, and some glue. It wouldn't sound good, but it would function as a microphone in some capacity, and at least be able to detect loud noises. And the same goes for a speaker; you could build one out of a red Solo cup, a magnet, some wire, and some glue. It wouldn't sound good, but you could at least get a basic "sound is being emitted from this" level from it. But if you showed that scrapped-together device to someone, they'd have no idea that it was a phone.
Fascinating. I'd say the secret to analog electronics for sound is that sound is waves, electricity is waves, and you can translate from one to the other with a resistor and a membrane. The end.
To me it's much more unclear how sound is first encoded into a digital signal, transmitted as a digital signal through wires and radio waves, and then translated back into sound in a phone. I mean it's essentially the same physics as the analog electronics, just with a bunch of extra steps added.
But maybe if one of those steps is "computer does thing" people just go like "ah yes, computer, makes sense".
Yeah, this is where sample rate and bit depth come into play. In case you're curious, digital audio is possible due to the Nyquist-Shannon sampling theorem. The TL;DR is that you don't record a continuous stream of audio data; you just sample the wave at regular intervals by recording the current amplitude, and then you can recreate it on the other end. The theorem states that a wave can be perfectly reconstructed from its samples, as long as the sample rate is more than twice the highest frequency in it. (Bit depth is a separate matter: it controls how precisely each sample is stored, so it adds a little rounding noise rather than limiting which frequencies you can capture.) Since human hearing generally tops out at 20kHz, we need to sample the audio signal a bit more than 40k times per second; most consumer-grade audio equipment uses 44.1 or 48kHz. Phones actually use a much lower sample rate for calls, but more on that later.
Again, as long as your sample rate is more than 2x the highest frequency being recorded, you're able to perfectly recreate the wave. For an example, here's a gif:
The image on the left shows the wave being recorded, and the dots are samples. As you add more samples, the reproduced wave gets more accurate. By the time you pass 2x the fastest frequency, there is only one possible wave (with nothing above that fastest frequency in it) that will fit every sample. Again, human hearing tops out around 20kHz, so we use a sample rate just above 40kHz.
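To make the 2x rule concrete, here's a minimal Python sketch (not from the thread; the 8kHz sample rate and the 3kHz and 5kHz test tones are arbitrary values picked for illustration). It samples two cosine waves and checks that a tone above half the sample rate yields exactly the same samples as a lower tone, which is why everything above that limit has to be filtered out before sampling.

```python
import math

def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be captured unambiguously at this sample rate."""
    return sample_rate_hz / 2

def sample_cosine(freq_hz, sample_rate_hz, num_samples):
    """Record the amplitude of a cosine wave at regular intervals."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]

SAMPLE_RATE = 8000  # Hz, assumed for this example

below_limit = sample_cosine(3000, SAMPLE_RATE, 16)  # 3 kHz < 4 kHz limit
above_limit = sample_cosine(5000, SAMPLE_RATE, 16)  # 5 kHz > 4 kHz limit

# The 5 kHz tone "folds back" and lands on the very same sample values
# as the 3 kHz tone, so the two cannot be told apart after sampling.
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(below_limit, above_limit)
)
print(nyquist_limit(SAMPLE_RATE), indistinguishable)
```

The samples alone can't tell you which of the two tones was recorded, so the recorder has to guarantee that nothing above the limit gets in.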
Phone calls will often put a filter on the high and low ends, and only capture the mid-range. It gives that distinct "this is shitty phone call quality" sound, but means they can use a much lower sample rate, because fewer samples means less data. Since they're lopping off most of the high end with that filter, traditional phone calls keep roughly 300Hz to 3.4kHz and get away with a sample rate of just 8kHz (newer "HD voice" calls use 16kHz). The intelligibility happens in the mid-range, so that's what the phone makers (and telecom companies) focus on. This low sample rate is also why hold music sounds so fucking awful. It's essentially being passed through a "make this sound as shitty as possible while still being intelligible" filter.
And then bit depth simply determines how detailed each sample is. If you use 8 bits per sample, that gives you 256 potential values per sample. 12 bits gives you 4096. The trade-off is that a higher bit depth means each sample takes more data: every extra bit doubles the number of possible values, but the storage only grows by that one bit per sample. Audiophiles will generally push for higher bit depths, so each sample is more accurate. In contrast, phone calls often use lower bit depths (again, to save data).
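As a rough worked example of that trade-off, here's a small Python sketch. The function names are made up for illustration, and the two configurations compared (CD-style 44.1kHz, 16-bit stereo versus a phone-style 8kHz, 8-bit mono stream) are assumptions chosen to show the arithmetic, not a description of any particular phone.

```python
def levels(bit_depth):
    """Number of distinct values one sample can take."""
    return 2 ** bit_depth

def bitrate_bps(sample_rate_hz, bit_depth, channels=1):
    """Raw (uncompressed) data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels

def quantize(amplitude, bit_depth):
    """Round an amplitude in [-1, 1] to the nearest of the evenly spaced levels."""
    steps = levels(bit_depth) - 1
    index = round((amplitude + 1) / 2 * steps)
    return index / steps * 2 - 1

print(levels(8), levels(12))          # possible values per sample
print(bitrate_bps(44100, 16, 2))      # CD-style stereo stream
print(bitrate_bps(8000, 8))           # phone-style mono stream
print(quantize(0.3, 8), quantize(0.3, 1))
```

Going from 8 to 16 bits doubles the data per sample while multiplying the number of levels by 256, and at 1 bit every sample collapses to either full negative or full positive.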
As for how it actually transmits the data, that's just 1s and 0s. It's a little more complicated than that (packets, for example), but in the digital realm, as long as the 1s and 0s get to where they need to be, you're good to go.
Holy shit, so I'm not just uniquely terrible at understanding people on the phone? I've searched so long for a phone that does high-quality phone calls, and I can't believe I never figured that it was a problem with both the phones and the carriers.
Yup, it's also why hold music sounds terrible: the sample rate and frequency range are so small that there's basically no music that would sound decent over the connection.
I studied electronics and GSM was a big part of the telecommunications subject. I visited the HQ of a mobile provider, was shown around and met the cartel boss (in hindsight, I wonder how much a Luigi moment would have affected the triopoly). I also visited a museum of technology and used an early touch-click model still connected to the network (pre-DTMF so not touch-tone, and no buffer so you had to wait for the simulated dial to stop clicking).
But still, I don't know the basics of wired phones cuz I've never really used them. How does voice travel both ways on a single twisted pair? How can Inspector Clouseau the telephone engineer in The Pink Panther (1978) hear a conversation from other phones in the house? How does the exchange know I've dialed the last digit? Can I use voice services on rotary phones, and what if I need to press * or #? All these would be obvious to 1980s kids…
How does the exchange know I've dialed the last digit?
This is the fun part: they don't! The exchange just listens for enough tones to make up an instruction, then performs what it was instructed to do. That's why, when you call places and it says "press 1 to speak with so-and-so", you've now been connected to another exchange which is waiting for instructions in the form of tones generated by button presses.
Phreakers figured out ways to generate the tones needed to do all sorts of fun things, like play the "payment received" tone into a payphone, or tell the exchange to connect to another exchange that it might not otherwise (and sometimes they would chain them together to see how many hops they could achieve before the sheer distance completely destroyed the call), and all sorts of other fun.
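For the button-press tones specifically (DTMF), each key is a pair of sine waves: one "row" frequency plus one "column" frequency. Here's a hedged Python sketch that generates a key's tone and then works out which key it was by measuring the energy at each of the eight standard frequencies with the Goertzel algorithm. The 8kHz sample rate, the 50ms tone length, and the function names are assumptions for this example; a real exchange also has to cope with noise, timing, and level checks that are left out here.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this sketch
ROW_FREQS = (697, 770, 852, 941)       # standard DTMF low group
COL_FREQS = (1209, 1336, 1477, 1633)   # standard DTMF high group
KEYPAD = ("123A", "456B", "789C", "*0#D")

def frequencies(key):
    """Return the (row, column) frequency pair for a single keypad key."""
    for row_index, row in enumerate(KEYPAD):
        if len(key) == 1 and key in row:
            return ROW_FREQS[row_index], COL_FREQS[row.index(key)]
    raise ValueError(f"not a DTMF key: {key!r}")

def tone(key, duration_s=0.05):
    """Generate the two summed sine waves for a key as a list of samples."""
    low, high = frequencies(key)
    count = int(SAMPLE_RATE * duration_s)
    return [
        0.5 * math.sin(2 * math.pi * low * n / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * high * n / SAMPLE_RATE)
        for n in range(count)
    ]

def goertzel_power(samples, freq_hz):
    """Energy of `samples` at one frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / SAMPLE_RATE)
    prev, prev2 = 0.0, 0.0
    for x in samples:
        prev, prev2 = x + coeff * prev - prev2, prev
    return prev * prev + prev2 * prev2 - coeff * prev * prev2

def detect(samples):
    """Pick the strongest row and column frequency and map them to a key."""
    row = max(ROW_FREQS, key=lambda f: goertzel_power(samples, f))
    col = max(COL_FREQS, key=lambda f: goertzel_power(samples, f))
    return KEYPAD[ROW_FREQS.index(row)][COL_FREQS.index(col)]

print(frequencies("5"), detect(tone("5")))
```

Because the signaling is just audio, anything that can play the right pair of frequencies into the line looks like a button press to whatever is listening on the other end.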
Yes, I know about phreakers but what I mean is, phone numbers differ by length. Did the exchange wait until no more tones/clicks in a while or is there a variable length achieved by, say, making all area codes start with 0?
The thing with phones is that so much is built with backwards compatibility, or at least similar design principles, that for any question like this you have to start at the first automated phone switchboards, powered by Strowger switches.
A rotary phone would issue a number of pulses as the dial spun: 1 pulse for 1, 3 pulses for 3, etc. Each pulse would trigger a Strowger switch at the local exchange, turning a dial on the switch one step for every pulse it receives. For a single-digit dial it would just have one switch that rotates with the number of pulses it receives. For a much more common 3-4 digit number, it would take the rapid succession of pulses to turn the first dial, then after a sufficient pause any subsequent pulses turn the next dial, and so on. Once it runs out of dials to turn, the switch connects the call to a line, which may go to another phone or may simply go to another Strowger switch awaiting additional pulses from the user dialing additional numbers.
For example, a user dials 5-5-5-1-2-3 with a pause between each digit as they dial. The first 5 sends 5 pulses in quick succession to the switch the phone is directly connected to (the local exchange), and that sets the first dial on the switch to 5. The user naturally pauses for a split second, creating the pause the switch interprets as a completed digit, then enters another 5, causing another 5 pulses to go to the switch at the local exchange. After the third 5 it connects to the next exchange, likely without the user even knowing, and the 1 is transmitted via the local exchange to the 555 exchange, where a Strowger switch turns to the 1 position from the single pulse, and so on. All of the switches that connect for such a call remain engaged and connected until they receive the disconnect pulse, and then they kill the connection, ending the call.
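The pulse-and-pause logic in that example can be sketched in a few lines of Python. This is a software imitation, not how the electromechanical switch was built, and the timing values (100ms between pulses, a 700ms pause between digits, a 300ms gap threshold) are assumptions picked for illustration. It also assumes the common convention that 0 is sent as ten pulses.

```python
def dial(digits, pulse_spacing_ms=100, pause_ms=700):
    """Simulate a rotary dial: return the time (in ms) of every pulse sent."""
    times = []
    now = 0
    for digit in digits:
        pulses = 10 if digit == 0 else digit  # assumed: 0 is sent as ten pulses
        for _ in range(pulses):
            times.append(now)
            now += pulse_spacing_ms
        now += pause_ms  # the user's natural pause before the next digit
    return times

def decode_pulses(pulse_times_ms, inter_digit_gap_ms=300):
    """Group pulses into digits: a long enough gap means the digit is complete."""
    digits = []
    count = 0
    previous = None
    for t in pulse_times_ms:
        if previous is not None and t - previous >= inter_digit_gap_ms:
            digits.append(count % 10)  # ten pulses wrap around to 0
            count = 0
        count += 1
        previous = t
    if count:
        digits.append(count % 10)
    return digits

print(decode_pulses(dial([5, 5, 5, 1, 2, 3])))
```

Note that the decoder never knows a number is "finished"; it only knows when a digit is, which matches the point above that each switch just consumes the digits it has dials for and hands the call on.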
With the transition to DTMF tones, much of this same behavior remained: each switch just waits for exactly as many digits as it expects, then connects the call to the next place. Modern digital and VoIP calls continue to emulate the same functionality of the Strowger switches, where the exchange is expecting a specific number of digits to be entered, and the user will either enter the correct number of digits or receive an error and a call that doesn't connect. There are actually still places operating analog telephone exchanges, so everything is ultimately backwards compatible, and the security and design challenges of sending signaling over the same wire as voice have remained, all thanks to the cost-saving choices made by some dude with an amazing mustache in the 1880s. Or we can go even further back, because the telephone was actually an innovation on the telegraph, originally designed as a solution for sending multiple telegraphs over a single telegraph trunk line.
Alexander Graham Bell, a pioneering audiologist who worked with deaf kids, excitedly penned a letter to an individual at the Western Union Telegraph Company describing using different tones of beeps over the line to differentiate between different telegraphs, and then excitedly went on to describe how, with enough different tones, one could not only transmit a nearly infinite number of different telegraphs at once but could theoretically transmit human speech! I've read scans of these original letters and you can just see the excitement building as Bell described that part.
But you understand how mobile phones work?
I've heard that the Pikachu cry in Pokémon Yellow has a bit depth of 1.