– The 4 basic components that constitute a self-driving car (computer vision, deep learning, robotics, and navigation)
Alright guys, to be honest, this isn’t my article. I stumbled on it on Medium, and it was such a good read that I thought I should share some good knowledge with you guys.
And to do that, I’m going to reproduce it exactly the way it was written on Medium (no edits).
Everything about Self Driving Cars Explained for Non-Engineers
I promise you won’t have to use either Google or a dictionary while reading this. In this post I will teach you the core concepts about everything from “deep learning” to “computer vision”. Using dead simple English.
You probably already know what self driving cars are and that they are considered the dope shit these days, so if you don’t mind I’m going to skip any high school essay-ish introduction. 🙂
But I’m not skipping my own introduction: Hi I’m Aman, I’m an engineer, and I have a low tolerance for unnecessarily “sophisticated” talk. I write essays on Medium to make hard things simple. Simplicity is underrated.
How do self-driving cars work?
Also called autonomous cars, they work on the combination of 4 cool fields of technology. Here’s a brief introduction to each of them, and then we will go into depth. By the end of this essay, you will know enough about all these technologies to be able to hold an intelligent conversation with an engineer or investor in these fields. Things like “artificial neural networks” won’t sound like magic spells or sci-fi film words anymore.
By the way, I’ve categorized some things into roughly separate “systems”, but of course in practice these systems are all highly interconnected without clearly defined boundaries.
Computer Vision (ooooooohhhhh sounds so cool)
The technology which allows the car to “see” its surroundings. These are the eyes and ears of the car. This whole system is called Perception. Basically we use:
1. Good old cameras, which are the most important (simple 2 megapixel cameras can work fine),
2. radars, which are second-most important. They throw radio waves around and, like ultrasound, detect the waves that bounce off objects and come back,
3. and lasers, also called “lidar”, which are cool to have but pretty expensive nowadays, and they don’t work well when it’s raining or foggy. You can say they’re like radar but with a little better picture quality, and lasers can travel pretty far so you get a greater range of view. The lasers are usually placed in a spinning wheel on top of the car, spinning around very fast and looking at the environment around them. Here you can see a lidar placed on top of the Google car:
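To make the radar idea concrete, here’s a toy sketch (my own illustration, not from the original article) of how echo timing turns into a distance — the pulse makes a round trip, so you halve it:

```python
# Toy radar ranging: a radio pulse travels to an object and bounces back.
# The round trip covers twice the distance, so we halve the result.

SPEED_OF_LIGHT = 3.0e8  # metres per second (approximate)

def distance_from_echo(echo_seconds):
    """Distance to an object given the round-trip time of a radar pulse."""
    return SPEED_OF_LIGHT * echo_seconds / 2

# A pulse that comes back after 2 microseconds means the object
# is about 300 metres away.
print(distance_from_echo(2e-6))
```

Real automotive radars add a lot on top of this (Doppler shifts to measure speed, beam steering to measure direction), but range-from-echo-time is the core trick.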
Deep Learning
This is the technology that allows the car to make driving decisions on its own, based on the information it gathered through the computer vision stuff described above. This is what trains the ‘brain’ of the car. We will go into detail about this in a minute.
By combining the two even on a basic level, you can do some interesting things. Here’s a project I made, detecting lane lines and other cars on the road using only a camera feed.
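For a flavour of how even a crude version of this works, here’s a toy sketch (mine, not the author’s actual project): on a synthetic black image with two bright vertical stripes standing in for lane lines, you can find the “lanes” just by thresholding brightness and counting bright pixels per column. Real lane detection (edge detection, perspective transforms, curve fitting) is much more involved.

```python
import numpy as np

# Synthetic 100x100 grayscale "camera frame": black road, two bright stripes.
frame = np.zeros((100, 100), dtype=np.uint8)
frame[:, 30] = 255  # left lane line
frame[:, 70] = 255  # right lane line

# Threshold: keep only the bright pixels, then count them per column.
bright = frame > 128
hits_per_column = bright.sum(axis=0)

# A column that is bright for most of the frame height is a lane line.
lane_columns = np.where(hits_per_column > frame.shape[0] // 2)[0]
print(lane_columns)  # the two stripe columns, 30 and 70
```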
Robotics
Think of your own body: you can see everything, and you can think and make decisions. But if your brain’s decisions (e.g. “lift the left leg”) can’t reach the muscles of the leg, your leg won’t move and you won’t be able to walk. Similarly, if your car has a ‘brain’ (= a computer with deep learning software), that computer needs to connect with the car’s parts to be able to control the car. Put simply, these connections and related functions make up ‘robotics’. They allow you to take the software brain’s decisions and use machinery to actually turn the steering wheel, press and release the throttle, the brakes, etc.
Navigation
Even after being given all of the above, the car ultimately still needs to figure out “where” it is on the planet and get directions to where it wants to go. There are several aspects to this, like GPS (your good old navigation device, which takes location information from satellites), stored maps, etc. You also mix in computer vision data.
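For a taste of the math navigation leans on, here’s a sketch (my addition, not from the article) of the classic haversine formula, which turns two GPS coordinates into a straight-line (“as the crow flies”) distance over the Earth’s surface:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

# San Francisco to Los Angeles: roughly 560 km as the crow flies.
print(round(haversine_km(37.7749, -122.4194, 34.0522, -118.2437)))
```

Actual routing is much fancier, of course — it searches road graphs from stored maps rather than drawing straight lines — but distance math like this sits underneath it.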
So the car controls its steering and brakes etc based on the decisions made by its brain, and these decisions are based on the information received through cameras and radars and lasers and the directions it receives from the navigation programs. This completes the whole system of a self-driving car.
Tidbit (feel free to skip)
Self driving cars come in many “levels”, from Level 1 to Level 5, based on how independent the car is and how little human assistance it needs while driving. Oh and there’s also Level 0, which is your good old manually operated car.
Level 5 means the car will be 100% self-driving. It will not have a steering wheel or brake pedals, because it’s not meant to be driven by people. These cars don’t exist yet; cutting-edge cars are still at Level 3, or at most Level 4.
Deep Learning Explained for Chimpanzees like Myself
Let’s say you were a wildlife safari enthusiast, and I was your super-idiot friend. I’m going on safari in Africa next week. And you give me some advice: “Aman, stay away from the fucking elephants.”
And I ask you back, “What’s an elephant?”
You will most likely say, “You stupid jerk, elephants are… okay never mind, here’s a photograph of an elephant, this is what it looks like. Stay away from them.”
And then I go off to safari.
Next week you get to know that I still managed to run into an elephant and almost ended up getting trampled. You ask me what happened.
I reply, “I don’t know, I did see this huge animal but it didn’t look anything like the photo you showed me, so I thought it was safe to play with and I went ahead and pulled the little wagging thing. Here’s the photograph of the animal I took before that…”
You: “Okay, Aman. I’m sorry, my bad. I expected too much from your brain. Let me give you a “cheat code” which you need to follow when you’re on the safari next time. If you see anything that looks brown-ish from all angles, seems to have four leathery legs like pillars, large flapping ears, and a thick long nose coming out of its face like a big tube, and is fat and bigger than you are, then that’s an elephant and you need to stay away.”
Next month I go back to safari again (hey, it’s my hypothetical story and I can go to safari as many times as I want) and I don’t run into any elephants this time, because your “cheat code” works well.
How did you come up with that “cheat code”? It’s because you’ve already seen an elephant from all different sides, and you picked some features of an elephant which stay pretty much the same regardless of which angle you view the elephant from. So you had lots of data about elephants to think about, and that helped you form a mental picture of the most obvious signs of an elephant, which you gave to me as a cheat code. Realize that I don’t really have to “know” exactly what an elephant is; I just have a cheat code that helps me recognize an elephant. But that cheat code works almost as well as knowing what elephants are!
But why wasn’t it okay to just show me one photograph (the one you showed earlier) and assume it was enough for me to get the idea? Because I (being an idiot of course) took that photograph as the “holy truth” — I assumed that every elephant will look *exactly* the same as that photograph, and will be a near perfect match.
Deep Learning works in a VERY similar way.
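To show how literal that similarity is, here’s a toy sketch (entirely my own, not from the article) of a single artificial “neuron” learning the elephant cheat code from examples. Each animal is a list of yes/no features; whenever the neuron guesses wrong, it nudges its weights, and after a few passes the features that actually matter (like the trunk) end up with big weights:

```python
# Features per animal: [brown, pillar legs, flappy ears, trunk, bigger than you]
animals = [
    ([1, 1, 1, 1, 1], 1),  # elephant -> stay away!
    ([1, 1, 0, 0, 0], 0),  # lion
    ([0, 1, 0, 0, 1], 0),  # giraffe
    ([1, 0, 0, 0, 0], 0),  # snake
]

weights = [0.0] * 5
bias = 0.0

def predict(features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

# Classic perceptron rule: when the guess is wrong, nudge the weights
# toward the right answer. Repeat over the examples a few times.
for _ in range(10):
    for features, label in animals:
        error = label - predict(features)
        weights = [w + error * x for w, x in zip(weights, features)]
        bias += error

print([predict(f) for f, _ in animals])  # every animal now classified correctly
```

Deep learning stacks many of these neurons into layers so the network can invent its own “cheat code” features from raw data instead of being handed them, but the learn-from-your-mistakes loop is the same idea.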
So guys, personally I’ll be stopping here for now. You can read the rest of the article on Medium by following the article source link above. And in the meantime, don’t forget to tell us what you think about self-driving cars in the comments section below. Share this article too, and subscribe to our newsletter.