
1 Docent-robot (Greggg)
By: Ryan Caldwell, Jessica Seibert, Matthew Wherry

2 Outline
Concept
Overview of Fall 2016 development
Individual partner responsibilities
Future Work
Conclusion
Q/A Session

3 What is a docent? “Docent” is a title used in the United States for educators trained to further the public's understanding of the cultural and historical collections of an institution, including local and national museums, planetariums, zoos, historical landmarks, and parks.

4 Fall 2016 development overview
In Fall 2016, we achieved our goal of sending/receiving hardware data and issuing motor commands via serial communication. Although effective, the result was not aesthetically pleasing. Movement algorithms were very primitive, and the robot was not controllable outside of simple commands.

5 Spring 2017 development goals
To make the robot more aesthetically pleasing, eliminate the serial communication and replace it with Bluetooth
Improve visual analysis using an Xbox 360 Kinect
Modify the code to take advantage of object-oriented principles and redesign it using software patterns
Modify the code to follow better programming practices

6 Hardware adaptations
Arduino → Raspberry Pi 3 Model B
Bluetooth capabilities
This seemingly simple hardware adaptation resulted in a complete hardware redesign

7 Bluetooth Communication
Goal: Create a package with classes that can send/receive data with the hardware via Bluetooth
Main hardware components:
Sonar sensors
Motor manipulation
GPS unit
Compass unit
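The package of communication classes described above might be sketched as follows. This is a minimal illustration, not the project's actual code: the class names are hypothetical, and a real deployment would swap `FakeTransport` for an actual Bluetooth socket (e.g. via a library such as PyBluez).

```python
# Hypothetical sketch: a hardware link that frames commands and parses
# "key=value" replies. FakeTransport stands in for a Bluetooth socket.

class FakeTransport:
    """Stand-in for a Bluetooth connection; returns canned replies."""
    def __init__(self, replies):
        self.replies = list(replies)
        self.sent = []

    def send(self, message):
        self.sent.append(message)

    def receive(self):
        return self.replies.pop(0)

class HardwareLink:
    """Sends newline-framed commands and parses numeric replies."""
    def __init__(self, transport):
        self.transport = transport

    def request(self, command):
        self.transport.send(command + "\n")
        key, _, value = self.transport.receive().partition("=")
        return key, float(value)

link = HardwareLink(FakeTransport(["sonar=42.5"]))
print(link.request("READ sonar"))  # ('sonar', 42.5)
```

The point of the extra `HardwareLink` layer is that the rest of the code never touches the raw transport, so switching from serial to Bluetooth (as the team did) changes only one class.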

8 Bluetooth communication process

9 Requesting data with the observer pattern

10 Dashboard as an observer
The dashboard registers itself as an observer
Displays the sensor data on the user interface
Visually displays which location Greggg has found
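The observer pattern from the two slides above can be sketched in a few lines. Class and method names here are illustrative assumptions, not the project's actual identifiers: a subject publishes sensor readings, and the dashboard registers itself to be notified.

```python
# Observer-pattern sketch: SensorHub is the subject, Dashboard an observer.

class SensorHub:
    def __init__(self):
        self._observers = []

    def register(self, observer):
        self._observers.append(observer)

    def publish(self, name, value):
        # Notify every registered observer of the new reading.
        for obs in self._observers:
            obs.update(name, value)

class Dashboard:
    def __init__(self):
        self.readings = {}

    def update(self, name, value):
        self.readings[name] = value  # a real UI would redraw here

hub = SensorHub()
dash = Dashboard()
hub.register(dash)
hub.publish("gps", (40.77, -81.93))
print(dash.readings)  # {'gps': (40.77, -81.93)}
```

Because the hub only knows the `update` interface, other observers (a logger, a route planner) can be added without touching the sensor code.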

11 Low Level Hardware
Raspberry Pi
PAM-7Q GPS
HMC5883L Compass
Ping))) Sonar Sensors
DHB-10 Motor Control Board
All group members

12 Connections between hardware and Pi
PAM-7Q GPS: uses a serial stream
HMC5883L Compass: uses an I2C connection
Sonar sensors: use GPIO (general-purpose input/output)
DHB-10 Motor Control: uses a serial stream

13 Low Level Software
Python controllers for the hardware
Each controller is threaded, which allows data to be gathered from everything at once
Bluetooth controller: controls and allows connections
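The threaded-controller design above can be illustrated with the standard library alone. This is a sketch under assumptions, not the project's code: each controller runs in its own thread and pushes readings onto a shared queue, and the sensors here are simulated functions rather than real hardware.

```python
import queue
import threading
import time

# Each simulated controller thread polls its "sensor" and pushes
# (name, reading) pairs onto a shared queue, so the main thread can
# gather data from every device at once.

def controller(name, read_fn, out, stop):
    while not stop.is_set():
        out.put((name, read_fn()))
        time.sleep(0.01)

readings = queue.Queue()
stop = threading.Event()
threads = [
    threading.Thread(target=controller, args=("sonar", lambda: 42, readings, stop)),
    threading.Thread(target=controller, args=("compass", lambda: 180, readings, stop)),
]
for t in threads:
    t.start()
time.sleep(0.05)        # let the controllers produce a few readings
stop.set()
for t in threads:
    t.join()
print(readings.qsize())
```

A `queue.Queue` is thread-safe, so no explicit locking is needed when many controllers feed one consumer.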

14 Goals of vision
Enable Greggg to recognize destinations or important objects in a scene
Mark waypoints on a tour using this visual capability
Replace QR codes for identifying objects or waypoints with something functionally equivalent
Still allow QR codes as an option (strategy pattern)
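The strategy pattern mentioned above is what lets QR detection and learned vision coexist. A minimal sketch, with illustrative names and trivially faked detectors standing in for a real QR decoder and the neural network:

```python
# Strategy-pattern sketch: the tour code asks a WaypointDetector, and the
# detection strategy can be swapped without touching the caller.

class QRCodeStrategy:
    def detect(self, frame):
        return "qr" in frame          # stand-in for a real QR decoder

class LearnedVisionStrategy:
    def detect(self, frame):
        return "sign" in frame        # stand-in for the neural network

class WaypointDetector:
    def __init__(self, strategy):
        self.strategy = strategy

    def at_waypoint(self, frame):
        return self.strategy.detect(frame)

detector = WaypointDetector(QRCodeStrategy())
print(detector.at_waypoint("frame with qr"))    # True
detector.strategy = LearnedVisionStrategy()     # swap strategies at runtime
print(detector.at_waypoint("frame with sign"))  # True
```

Swapping the strategy object at runtime is exactly what makes "use QR codes if you want" a one-line change rather than a rewrite.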

15 Greggg's approach: machine learning
Learn what a sign looks like and which features differentiate signs
In the same manner, learn what objects might look like in a museum or other venue
In the same manner, learn what building facades look like
Challenges:
Be robust; don't unintentionally learn things that might change: unrelated signs around doors, people, weather, light reflections, objects only in a specific position
Avoid hard-coded changes when pursuing new goals, such as touring multiple settings or touring for different purposes
It has to actually work!
Use consumer cameras for vision, not expensive LIDAR

16 Greggg's approach: convolutional neural networks
A closer look: each neuron on the second layer is connected to only 3 inputs; this region is called its “receptive field”. Weights are shared, performing a strided convolution across the input. This sort of neuronal connectivity mimics the mammalian retina. Mathematically, it can also be viewed as learning image kernels (convolutions), which can take the place of hand-built Gaussian filters or calculus operations such as the Laplacian.
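The two ideas on this slide, a 3-input receptive field and shared weights, can be shown with a strided 1-D convolution in plain Python. This is an illustrative sketch, not Greggg's network code:

```python
# Strided 1-D convolution: each output sees only len(weights) inputs
# (its receptive field), and the same weights are reused at every position.

def strided_conv1d(inputs, weights, stride):
    k = len(weights)
    return [
        sum(w * x for w, x in zip(weights, inputs[i:i + k]))
        for i in range(0, len(inputs) - k + 1, stride)
    ]

signal = [0, 0, 1, 1, 1, 0, 0]
edge_kernel = [-1, 0, 1]   # responds to rising and falling edges
print(strided_conv1d(signal, edge_kernel, 1))  # [1, 1, 0, -1, -1]
print(strided_conv1d(signal, edge_kernel, 2))  # [1, 0, -1]
```

Note how the output is positive where the signal rises and negative where it falls: the shared 3-weight kernel acts as a tiny edge detector, which is precisely what a learned convolution can become.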

17 What does that mean? The neural network can do the same thing as the GIMP image editor, but for the purpose of recognizing objects. From GIMP: a 3x3 Laplacian functions as an edge detector, but lets noticeable noise through. Demonstrated on the green channel.

18 What does that mean? The neural network can do the same thing as the GIMP image editor, but for the purpose of recognizing objects. From GIMP: a 5x5 Gaussian, i.e. a Gaussian blur. Demonstrated on the green channel.

19 What does that mean? The neural network can do the same thing as the GIMP image editor, but for the purpose of recognizing objects. From GIMP: a Laplacian of Gaussian (slightly modified to further suppress noise). Demonstrated on the green channel.

20 No hardcoded solution for each problem
The Laplacian of a Gaussian is a common edge-detection technique in computer vision: the Gaussian denoises while the Laplacian highlights change (thank you, calculus). However, we want the results of a neural network (learning to associate inputs with outputs) while keeping the convenient functionality of convolutions. So while we could hard-code filters just as GIMP does, we prefer to let our neural network learn what kinds of convolutions work well for its purpose, when to use them, and how to use them.
Greggg's neural network has 36 layers of convolutions, and each layer can build an aggregate of learned information: features learned early on can be reused later in the network. So we have convolutions of convolutions of convolutions, just as a neural network makes a hypothesis based on a hypothesis. This is how it learns complex internal relationships and representations of an image, or of other data. The network can now learn our solutions for us, without us hardcoding information about images. But nothing is free: neural networks are not magic, yet.
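To make the GIMP comparison concrete, here is the classic 3x3 Laplacian kernel applied by hand to a tiny image. A trained network would discover kernels like this on its own instead of having them hard-coded; the helper name is our own for illustration.

```python
# The standard 3x3 Laplacian kernel: sums to zero, so flat regions map
# to 0 while sharp intensity changes produce large responses.
LAPLACIAN = [[0,  1, 0],
             [1, -4, 1],
             [0,  1, 0]]

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (no padding) on nested-list images."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[a][b] * image[r + a][c + b]
                for a in range(kh) for b in range(kw))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# A single bright pixel on a dark background gives a strong response.
image = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(conv2d_valid(image, LAPLACIAN))  # [[-36]]
```

Greggg's 36 convolutional layers stack many such operations, so later layers convolve the outputs of earlier ones rather than raw pixels.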

21 Making the network robust (and actually working)
Long story short, the network will learn anything that is made available to it, including things we might not expect (changes in lighting). There are mathematical techniques we can use to make the network robust (regularization), but they only help so much: a neural network is only as good as its data. So we need to take a wide, varied sample of the things we want to detect:
Images from different cameras
Multiple resolutions
Multiple times of day
Different camera settings
Obstructions over irrelevant parts of an image
The many angles from which we expect the network to identify something
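The sampling strategy above can be supplemented with simple data augmentation: generating flipped and brightness-shifted variants of each image so the network does not latch onto one lighting condition or orientation. A minimal sketch, with images as nested lists of pixel intensities and helper names of our own invention:

```python
# Tiny augmentation sketch: each input image yields four training variants.

def hflip(image):
    """Mirror the image left-to-right."""
    return [list(reversed(row)) for row in image]

def brighten(image, delta):
    """Raise every pixel by delta, clamped to the 0-255 range."""
    return [[min(255, px + delta) for px in row] for row in image]

def augment(image):
    return [image,
            hflip(image),
            brighten(image, 30),
            brighten(hflip(image), 30)]

img = [[10, 200],
       [20, 40]]
variants = augment(img)
print(len(variants))   # 4
print(variants[1])     # [[200, 10], [40, 20]]
```

Real pipelines add rotations, crops, and color jitter in the same spirit; the common thread is that every variant is a plausible view the robot's camera might actually produce.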

22 With multiple affordable cameras

23 William All group members

24 Our 2016 future work goals
Visual analysis with Xbox 360 Kinect - Done
Speech-To-Text Analysis
Hardware Improvements - Done
Aesthetic Improvements to the robot - Done
New movement algorithms
Route optimization - Zeus
Log to a Database
Visually show Greggg's path - Done
All group members

25 Conclusion
We completed many of our goals from Fall 2016
Migrated from serial communication to Bluetooth communication, which opens the door for interesting implementations utilizing Bluetooth
Controllable movement via Windows-compatible joysticks: Xbox 360 controller, PlayStation 3
All group members

26 Future work
Speech-To-Text Analysis
Run completely autonomous tours
Utilize Bluetooth in other interesting ways
Continue working on publishing a paper on our work
Multiple clients connected to the server (mobile app support)
All group members

27 Questions? All group members

