MIT-AVT: Data Collection Device (for Large-Scale Semi-Autonomous Driving)
HVx9bwiMWGQ • 2018-04-09
Kind: captions · Language: en

The MIT Autonomous Vehicle Technology study is all about collecting large amounts of naturalistic driving data. Behind that data collection is this box right here, which Dan has termed RIDER. Dan is behind a lot of the hardware work we do and the embedded systems, and Michael is behind a lot of the software: the data pipeline, as well as offloading the data from the device. They'd like to tell you some of the details behind RIDER and behind the sensors.

We have three cameras in the car, with the wires running back into the trunk, which is where RIDER sits. There are a lot of design specifications involved in making this system work month after month, reliably, across multiple vehicles, multiple weather conditions, and so on. At the end of the day we have multiple sensor streams: the three cameras coming in, plus the IMU, GPS, and all of the raw CAN messages coming from the vehicle itself. All of that has to be collected reliably, synchronized, and post-processed once we offload the data.

First, we have a single-board computer running a custom version of Linux that we wrote specifically for this application. It integrates all of the cameras and all of the sensors (GPS, CAN, IMU) and writes everything onto the solid-state drive we have on board. There are some extra components for cellular communication as well as power management. Throughout the device we have the single-board computer, the sensor integration, and the power system. This is the solid-state drive, which connects directly to the single-board computer. On top of the single-board computer sits a sensor integration board; there you can see the real-time clock with its battery backup, and the CAN transceiver. On the reverse side of this board are the GPS receiver and the IMU. This is the CAN-control power board, which monitors CAN throughout the car and determines whether the system should be on or off. When the system is on, this board sends power through a buck converter that steps the vehicle's 12 volts down to the 5 volts that run the single-board computer.
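The multiple sensor streams described above all have to be stamped against one shared clock before anything downstream can work. A minimal Python sketch of that idea; the stream names, clock, and payloads here are hypothetical, not RIDER's actual interfaces:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    stream: str       # e.g. "cam_face", "gps", "imu", "can" (illustrative names)
    timestamp: float  # seconds, read from the one shared clock
    payload: bytes

class Recorder:
    """One append-only log shared by every sensor stream, so each
    sample carries a timestamp from the same clock."""
    def __init__(self, clock):
        self.clock = clock   # callable returning seconds
        self.log = []

    def record(self, stream, payload):
        self.log.append(Sample(stream, self.clock(), payload))

# Demo with a fake, deterministic clock.
ticks = iter([0.0, 0.01, 0.02])
rec = Recorder(clock=lambda: next(ticks))
rec.record("gps", b"$GPGGA,...")
rec.record("imu", b"\x01\x02")
rec.record("can", b"\x7e\x00")
```

The point of the single shared clock is that cross-stream alignment later on reduces to comparing timestamps, rather than reconciling several independent clocks.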
We also have a 4G wireless connection on board to monitor the health of RIDER and report things like the free capacity left on the drive, as well as temperature and power usage information. The cameras connect to RIDER through this USB hub right here.

We needed the box to do at least three things: record from at least three cameras, record CAN vehicle telemetry data, and store all of this data on board for a long period of time, so that people could drive around for months without us having to offload the data from their vehicles. We're talking about hundreds of thousands of miles' worth of data; uncompressed, every hundred thousand miles comes to about a hundred petabytes of video. So another key requirement was how to store all this data on the device, and then how to offload it successfully onto thousands of machines to be processed with the computer vision and deep learning algorithms we use. One of the essential elements of that was doing compression on board.

These are Logitech C920 webcams; they can do up to 1080p at 30 frames per second. The major reason we went with them is that they do onboard H.264 compression of the video. That offloads all the video encoding from our single-board computer onto the individual cameras, which lets us use a very slim, pared-down, lightweight single-board computer to run all of these sensors. This is the original Logitech C920 that you would buy in a store. These are two of the same Logitech C920s, but placed in a custom-made camera case just for this application. That lets us add our own CS-type lenses, so we can have a zoom lens as well as a fisheye lens inside the car, giving us a greater range of fields of view inside the vehicle. This is the fisheye lens, and this is the zoom lens.
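As a back-of-the-envelope check on the storage point above: onboard H.264 compression is what brings petabytes of raw video down to something a single onboard SSD can hold. A rough estimate in Python, where the per-camera bitrate and average driving speed are assumptions, not measured values from the study:

```python
def terabytes_per_100k_miles(cameras=3, mbit_per_s=3.0, avg_mph=30.0):
    """Rough storage estimate for H.264-compressed video.
    cameras:    number of simultaneous streams
    mbit_per_s: assumed compressed bitrate per camera
    avg_mph:    assumed average driving speed"""
    hours = 100_000 / avg_mph                 # driving time to cover 100k miles
    bits = cameras * mbit_per_s * 1e6 * hours * 3600
    return bits / 8 / 1e12                    # bits -> terabytes

est = terabytes_per_100k_miles()              # ~13.5 TB under these assumptions
```

Under these assumed numbers, three compressed streams come to roughly 13.5 TB per hundred thousand miles, versus the roughly hundred petabytes quoted for uncompressed video, which is why the encoding has to happen before the data touches the drive.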
CS-type, along with C-type, is a standard kind of lens mount that connects to cameras like these, often the industrial cameras used in autonomous vehicle applications.

We tested these cameras to see what would happen if they were placed inside a hot car on a summer day; we wanted to see whether they would hold up to the summer heat and still function as needed. We put the cameras in a toaster, a scientific toaster. What temperature did it go up to? We cycled the cameras between 58 and 75 degrees Celsius, which covers the roughly 150-degree-Fahrenheit maximum temperature a car can reach in the summer. We also cranked it up to 127 degrees Celsius just to see what would happen after prolonged high heat. In fact, the cameras continued to work perfectly fine after that.

Creating a system that intelligently and autonomously turns on and off to start and end recording was also a key aspect of this device. Since people were just going to be driving their normal cars, we couldn't necessarily rely on them to start and end recording, so RIDER figures out when the car is running and when it's off, and starts and stops recording automatically. How does RIDER know when to turn on? We use CAN to determine when the system should turn on and off: when CAN is active, the car is running and we should turn the system on; when CAN is inactive, we should turn the system off and end recording. This also gives us the ability to trigger on particular CAN messages. For instance, if we want to start recording as soon as the driver approaches the car and unlocks the door, we can do that, or when they turn the car on, or put it into drive, and so on.

The cost of the car the system resides in is about a thousand times that of the system itself; these are hundred-thousand-plus-dollar cars, so we have to make sure we design the system and run the wires in a way that does no damage to the vehicles.
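The CAN-based on/off behavior described above (record while the bus is active, stop once it goes quiet) can be sketched as a tiny state machine. The quiet-timeout value here is an assumption for illustration, not the device's actual setting:

```python
class CanPowerGate:
    """Start recording when CAN traffic appears; stop once the bus has
    been quiet for a while. The timeout is an assumed value."""
    def __init__(self, now, quiet_timeout_s=5.0):
        self.now = now                    # callable returning seconds
        self.quiet_timeout_s = quiet_timeout_s
        self.last_frame_t = None

    def on_can_frame(self):
        self.last_frame_t = self.now()    # any frame proves the car is awake

    def should_record(self):
        if self.last_frame_t is None:
            return False                  # bus has never been active
        return (self.now() - self.last_frame_t) < self.quiet_timeout_s

# Demo with a fake clock we can advance by hand.
t = [0.0]
gate = CanPowerGate(now=lambda: t[0])
gate.on_can_frame()                  # car wakes up, CAN goes active
on_while_driving = gate.should_record()
t[0] = 60.0                          # a minute with no CAN frames: car is off
on_after_quiet = gate.should_record()
```

Triggering on specific messages (door unlock, shift into drive) would just mean inspecting the frame's arbitration ID inside `on_can_frame` instead of treating all traffic equally.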
What kinds of things fail, and when? The biggest issue we've had with the system is camera cables becoming unplugged. When a camera cable becomes unplugged, the system tries to restart that subsystem multiple times, and if it's unable to, it shuts off recording completely; as long as that cable remains unplugged, RIDER will not start up the next time. So one issue we've seen is that an unplugged cable costs us the chance to record some data. One requirement of the system from the very beginning was that all the video streams are always recorded, perfectly synchronized. If any subsystem fails to record from its sensors, we try again and restart the system, and if it's still not working, it shuts down. The video is essential to understanding what drivers are doing in these systems, so if one of the cameras is not working, the system as a whole is not working.

The other crucial aspect of a data collection system taking in multiple streams is that those streams have to be synchronized, perfectly. Synchronization was the highest priority from the very beginning of RIDER's design. We have a real-time clock on board RIDER that gives us down to two parts per million of accuracy in timestamping. This means that over the course of a one-and-a-half-hour drive, the timestamps issued to the different subsystems may drift by up to seven or so milliseconds relative to one another, which is extremely small compared to most clocks in computers today. Once the data is offloaded, the very first thing we do is make sure the data was timestamped correctly so that we can synchronize it; the first step the data pipeline performs is synchronizing the data. That means taking the timestamp from the real-time clock that was assigned to every single piece of sensor data, and using that timestamp to align the data together.
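Once every sample shares one timebase, aligning a video frame with the closest sample from another stream reduces to a nearest-timestamp search. A sketch with made-up timestamps and rates, not the pipeline's actual code:

```python
import bisect

def nearest_indices(frame_ts, sensor_ts):
    """For each video-frame timestamp, return the index of the nearest
    sensor sample. Both lists are sorted, in seconds."""
    out = []
    for t in frame_ts:
        i = bisect.bisect_left(sensor_ts, t)
        # The nearest sample is either just before or just at/after t.
        best = min(
            (j for j in (i - 1, i) if 0 <= j < len(sensor_ts)),
            key=lambda j: abs(sensor_ts[j] - t),
        )
        out.append(best)
    return out

frames = [0.000, 0.033, 0.067]   # ~30 fps video timestamps (illustrative)
gps = [0.0, 0.05, 0.10]          # a slower hypothetical stream
pairs = nearest_indices(frames, gps)   # -> [0, 1, 1]
```

Higher-rate streams such as the IMU or CAN would be matched the other way around (many samples per frame), but the same sorted-search idea applies.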
For video, that means 30 frames per second, perfectly aligned with the GPS signal and so on. Some other sensors, like the IMU and the CAN messages coming from the car, arrive much more frequently than 30 Hz, so we have a different synchronization scheme there. Overall, though, synchronization, from the very beginning of the hardware design to the very end of the software pipeline, is crucial, because we want to analyze what people are doing in these semi-autonomous vehicles and how they're interacting with the technology. That means using data from the face camera, the body camera, and the forward view, synchronized together with the GPS, the IMU, and all the vehicle telemetry messages coming from CAN.

Video stream compression, a very CPU- or GPU-intensive operation, is performed on board the cameras. There are other CPU-intensive operations performed on RIDER, like the sensor fusion for the IMU, but for the most part there are sufficient CPU cycles left for the actual data collection to proceed without any skips or drifts in the sensor stream collection.

One question we get is how we move the data from this box to our computers and then to the cluster that does the compute. When we receive a hard drive from one of the RIDER boxes we're swapping, we connect it locally to our computers and then do a remote copy to a server that holds all of our data. We then check the data for consistency and perform any fixes on the raw data in preparation for the synchronization operation. We're not doing any remote offloading of data: the data lives on RIDER until the subjects, the drivers who own the cars, come back to us; then we take the hard drive, swap it out, and offload the data from it.
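A consistency check after the copy could be as simple as comparing a content hash of each recording on the source drive against the copy on the server. This is a generic sketch (chunked, so multi-gigabyte video files never have to fit in memory), not RIDER's actual tooling:

```python
import hashlib
import os
import tempfile

def file_sha256(path, chunk_bytes=1 << 20):
    """Hash a file in 1 MiB chunks and return the hex digest; two copies
    of a recording match exactly when their digests match."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_bytes)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

# Demo on a small throwaway file standing in for a video recording.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"fake video bytes")
    path = f.name
digest_source = file_sha256(path)
digest_copy = file_sha256(path)   # hashing the "copy" (same bytes here)
os.unlink(path)
```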
Can you tell me the journey a pixel takes on its way from the camera to our cluster? Well, first the camera records the raw image data according to the settings we've configured from the RIDER box. That raw image data is compressed on the camera itself into H.264 format, then transmitted over the USB cable to the single-board computer in the RIDER box, where it's recorded onto the solid-state drive in a video file. There it stays until we do an offload, in the course of about six months for our NDS subjects and one month for 50 subjects. After that, the drive is connected to a local computer and synchronized with a remote server, and the data is processed with initial cleaning algorithms to remove any corrupt data or to fix any subject data in the configuration files for that particular trip. Once the initial cleaning is taken care of, the data is synchronized at 30 frames per second and can then be used for different detection algorithms or for manual annotation.

The important hard work behind the magic that deep learning and computer vision unlock is the synchronization and the cleaning of messy data: making sure anything that's at all weird gets taken out, so that at the end of the pipeline we have a clean dataset of multiple sensor streams, perfectly synchronized, that we can use both for analysis and for annotation, which in turn improves the neural network models used for the various detection tasks.

RIDER has done an amazing job, across over 30 vehicles, of collecting hundreds of thousands of miles' worth of data, billions of video frames. We're talking about an incredible amount of data: all compressed with H.264, it comes to close to 300 terabytes. But of course you can always improve, so what are our next steps? One huge improvement for RIDER would be transitioning to another single-board computer, in particular a Jetson TX2. There's a lot more capability for added sensors, much more compute power, and even the possibility of developing some
real-time systems with the Jetson. One of the critical things you realize when collecting huge amounts of driving data is that most of driving is quite boring: in terms of understanding driver behavior, or training computer vision models for edge cases and so on, nothing interesting happens. So one of the future steps we're taking is based on what we've found in the data so far. We know which parts are interesting and which are not, and we can design onboard algorithms that process that video data in real time and determine: is this the kind of data I want to keep at this time? If not, throw it out. That means we can collect more efficiently, capturing just the bits that are interesting for edge-case neural network model training or for understanding human behavior. This is a totally unknown, open area, because we really don't understand what people do in semi-autonomous vehicles when the car is driving itself versus when the human is driving.

The initial stages of the study were to keep all the data, so we could analyze body pose, glance allocation, activity, smartphone usage, all the various sudden decelerations, autopilot usage (where it's used, how it's used), geography, weather, night driving, and so on. But as we start to understand where the fundamental insights come from, we can become more and more selective about which epochs of data we want to collect. That requires real-time processing of the data, and as Dan said, that's where the Jetson TX2 and the power it brings become more and more useful.

All of this work is part of the MIT Autonomous Vehicle Technology study. We've collected over three hundred twenty thousand miles so far, and we're collecting five hundred to a thousand miles every day, so we're always growing and adding new vehicles; we're working on adding a Tesla Model 3, a Cadillac CT6 with the Super Cruise system, and others. One of the driving principles behind our work is that the kind of data collection we need to
design safe semi-autonomous and autonomous vehicles must capture not just the forward roadway, or any kind of sensor coverage of the external environment. We need rich sensor information about the internal environment: what the driver is doing, everything about their face, their glance, their cognitive load, their body pose, everything about their activity. We truly believe that autonomous vehicles require an understanding of how the human supervisors of those systems behave, and of how we can keep them attentive, keep their glance on the road, and keep them effective, efficient supervisors of those systems.
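The selective-recording idea described earlier, deciding on board which epochs of driving are worth keeping, might look something like the sketch below. Every feature name and threshold here is purely illustrative; the study's actual criteria are not specified in this talk:

```python
def keep_epoch(epoch):
    """Decide on board whether an epoch of driving data is worth keeping.
    All keys and thresholds are hypothetical stand-ins."""
    if epoch.get("autopilot_engaged", False):
        return True                               # autopilot use is always of interest
    if epoch.get("max_decel_g", 0.0) > 0.3:
        return True                               # sudden deceleration / hard braking
    if epoch.get("glance_off_road_s", 0.0) > 2.0:
        return True                               # long off-road glance
    return False                                  # boring epoch: discard

kept = keep_epoch({"max_decel_g": 0.45})
dropped = keep_epoch({"max_decel_g": 0.05, "glance_off_road_s": 0.4})
```

In a real deployment the features themselves (glance, deceleration, autopilot state) would come from the synchronized streams the pipeline already produces; the filter only changes which epochs survive to the offload.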