NSF Research Traineeship (NRT) 2016-17 : Graduate Program : Goergen Institute for Data Science and Artificial Intelligence : University of Rochester

NSF Research Traineeship (NRT) 2016-17

Students participated in a one-semester course consisting of three month-long modules, entitled "Methods In Data Enabled Research Into Human Behavior And Its Cognitive And Neural Mechanisms". The modular course provided an innovative solution to the problem of providing cross-training without overburdening trainees with additional coursework. Each module immediately engaged students in hands-on experience with tools and methods of computer or cognitive science. We developed a set of modules over the course of the NRT, rotating some of the modules in and out each year. This year's modules were:

Deep learning, taught by Professor Henry Kautz. This module introduced students to deep learning, that is, the use of multilayer artificial neural networks, with applications to perception and natural language processing. The use of advanced toolkits such as Theano and Caffe made it possible for students to progress in just four weeks from basic theory to building deep learning systems that solved challenging problems in image and text classification.
Methods in psycholinguistics, taught by Professor Florian Jaeger. This module provided a hands-on tutorial in experimental techniques for inferring human mental processing from fine-grained measurement of responses to auditory or visual stimuli in the laboratory. Methods included eye tracking, speech perception experiments, acuity tests, and iterated artificial language learning. In the last week of the module, students used these methods to test hypotheses about how word choice is realized in language production.
Computer vision, taught by Professor Jiebo Luo. Students learned core concepts in computer vision through a series of hands-on exercises, from basic image processing to deep learning methods, culminating in the creation of a system for generating English-language captions for arbitrary images.

The following semester, the students participated in "Practicum In Data Enabled Research Into Human Behavior And Its Cognitive And Neural Mechanisms". In this course, trainees worked in mixed teams of computer science and brain and cognitive sciences PhD students to develop an artifact that addressed a research question and/or infrastructure need. The team learned principles of design by participating in the stages of brainstorming, specification, initial design, prototyping, refinement, and evaluation. The course was led by Professor Ehsan Hoque, and the teams worked on two projects: multimodal speech recognition and diagnosis of aphasic patients.

2016-17 Practicum Projects

Multimodal Speech Recognition

Our goal is to improve speech recognition by using video as well as audio. Much speech recognition software relies solely on auditory processing, yet humans may use additional cues to aid their understanding, such as context, lip-reading, and demographic information.

We hope to build a neural network model that takes advantage of these additional sources of information.

We will compare model performance to human performance and identify what kinds of video features are most useful for improving speech recognition in both our model and humans. We are not aware of any previous academic work that specifically investigates the usefulness of demographic information paired with audio-visual speech recognition.

Furthermore, while some areas of psycholinguistics have investigated the influence of demographic information such as race on speech perception, no studies have directly investigated the effect of demographic information on the accuracy of word/sentence recognition. Comparing what kinds of demographic information humans vs. our model use to optimize speech recognition is also theoretically important for cognitive models of behavior.

Project Website

Diagnosis of Aphasic Patients

Aphasia is a condition resulting from brain damage that can cause difficulty in the production or comprehension of speech. There are several different subtypes of aphasia, including Broca’s aphasia (which involves difficulty in the production of speech but relatively intact comprehension), and Wernicke’s aphasia (which involves relatively fluent speech but difficulty with comprehension). We propose to design a computer model that can classify aphasic speech in the auditory and text domains.

Research questions: What are the fundamental patterns present in aphasic speech that allow humans to subjectively detect the presence of aphasia? How important is each aspect of traditionally aphasic speech in the diagnosis of aphasia? Can we unveil novel characteristics of aphasic speech that can inform future diagnoses? If we build a model that can (to some degree) “diagnose” aphasia, we can use the weights from the model to assess the relative importance of different speech characteristics in diagnosis.

Project Website

Bios of the 2016-17 Class Members

	Wednesday Bushong I'm a second year PhD student in the Brain and Cognitive Sciences department advised by Florian Jaeger. I use a combination of computational and experimental methods to study how people learn the statistics of their language and strategically use this knowledge during real-time processing. Website: https://wbushong.github.io/ Email: bushong@hartford.edu Project contributions: Our group of brain and cognitive sciences and computer science students investigated whether introducing visual information about the face aids in speech perception in humans and neural networks. I contributed to the experimental side, designing and running experiments to test how visual information changes speech perception in humans and comparing this to how visual information aided speech perception in neural networks. What did I learn from the course? The biggest takeaway for me was the ability to collaborate, especially with people who have very different skillsets. How might this course affect my career? As I mentioned, I felt that teamwork and collaboration skills were the main benefits of the course and they will have a very positive impact on my career as I collaborate more and more with researchers outside of my subfield. The computational skills I learned from the first part of the course are also applicable to my research and to current industry standards if I decide to leave academia. UPDATE June 2020: Wednesday Bushong completed her PhD and will be starting a tenure-track faculty position at University of Hartford as an Assistant Professor of Psychology.
	Benjamin Chernoff Ben is a second year brain and cognitive sciences PhD student working with Brad Mahon. He graduated from George Washington University in 2014 with a BA in psychology and a minor in statistics. He is interested in neurobiological models of speech production. In particular, he is interested in the causal role of inferior frontal gyrus connectivity for sentence production. To examine this, he uses functional and structural MRI as well as neuropsychological assessment to study neurosurgery patients longitudinally. Current CV Email: bchernof@ur.rochester.edu Project contributions: I was the project manager for my group, Diagnosis of Aphasia, Rochester. I helped come up with the original idea for the project and was able to help frame the question based on my background in aphasia. I also acquired access to the database of patients we used, and I built the website for our project. What did I learn from the course? I learned important fundamental concepts of machine learning, ways that those concepts have been implemented in cognitive science generally, and how to use programming to implement them in my own research. How might this course affect my career? Machine learning is an emerging tool in my field and within neuroimaging in general. This course furthered my career by introducing machine learning concepts in a way that I can apply to my own research. This increases my skillset as a researcher, which will help create opportunities for me after graduation. Conferences since the class ended: Annual Meeting of the Cognitive Neuroscience Society, 2018
	Sam Cheyette I graduated from Carnegie Mellon in 2016 with a B.S. in cognitive science. I'm now a brain and cognitive sciences PhD student advised by Steve Piantadosi, studying numerical cognition and conceptual development. I participated in NRT in my first and second years (2016-2018). Website CV Project contributions: I was involved in numerous projects, but my favorite was probably developing a tool to integrate speech with lip-reading. At first, I was involved in developing the tool (a deep recurrent neural network), but my primary focus became figuring out how well non-deaf people can integrate speech with lip-reading. What I learned: On a "factual" level, I learned a lot of useful machine learning methods, including how to train large deep neural networks using publicly available tools (and other people's pre-trained networks), and useful statistical techniques, such as how to infer people's priors in hierarchical Bayesian models. But, unique to this class, I learned how to work effectively in a team on larger-scale projects requiring the implementation of multiple co-dependent pieces. How it might affect my career: Many of the machine learning tools that I learned in this class are widely used or becoming widely used in my field. Having some experience with those tools will certainly be useful. Greatest challenge: Working with many other people towards an ambitious goal can be frustrating (and rewarding). It's tough trying to keep everyone in a team on the same page and motivated toward a common purpose. Update Since Traineeship: Sam is finishing his PhD at the University of California at Berkeley, where he continues to work with Steven Piantadosi.
	Brian Dickinson I am a second year computer science PhD student advised by Henry Kautz. My research interests are in applications of artificial intelligence, particularly in data science, for social good. Prior to coming to Rochester I completed a double major in business administration and computer science at Houghton College. Website Resume Project Contribution: Our group's project focused on separately comparing how humans and our trained artificial neural networks used visual and audio information in speech perception. One subgroup ran experiments on how well people could understand speech with and without visual cues while the other group focused on training a neural network to perform the same task. In particular, I designed and trained a neural network which extracted demographic data such as age, gender, and race from the speaker videos. What did I learn from this course? / What was surprising or the greatest challenge? My biggest takeaway and greatest challenge was cross-disciplinary group collaboration. This was the first time I worked on a large project where no one person fully understood every component of the project. It was an excellent exercise in giving clear, concise descriptions of the parts of your work that everyone needed to understand. How might this course affect my career? The most immediate impact for me was having worked on a practical project using neural networks rather than simply working on classwork examples. This is immediately useful because interviewers are far more interested in you if you can apply what you know to real-world problems and data sets.
	Weilun Ding Weilun Ding is an MS student in data science with a concentration in brain and cognitive sciences at the University of Rochester. He is preparing for PhD study in neuroeconomics, which lies at the intersection of his undergraduate study in economics and his current graduate work. Weilun’s interests are mainly in the behaviors and neural mechanisms of economic decision-making and the computational modeling of such processes, with an emphasis on the algorithmic circuit underlying human value-computation systems. He is open to methodologies of psychophysics, single-unit recording and neuroimaging, together with co-validation from artificial intelligence.
	Natalia Galant Natalia Galant is an MS student in data science. Her undergraduate degree and past research experience were in neuroscience. She is interested in integrating these two fields in an interdisciplinary way. Updates since training: Natalia is a Quantitative Strategist working for Variant Perception.
	Carol Jew Carol Jew is a third year graduate student in the department of Brain and Cognitive Sciences at the University of Rochester. She is working with her advisor Rajeev Raizada to investigate how the brain is able to learn about new objects and recognize familiar objects effortlessly. She received her undergraduate education from New York University in 2012. Website Email: carol.jew@rochester.edu Project contributions: My main contribution to my group's project, Diagnostic of Aphasic Patients, Rochester (DOAP ROC), was cleaning and parsing the transcriptions of the video recordings of the participants’ performance on the diagnostic evaluations and labeling each word in an utterance according to its part of speech. We then compared the utterances produced by participants with aphasia and those produced by control participants and looked for differences that characterized speech production across the two groups. We also passed these features along to our teammates for use in training and testing aphasia classification models. What did I learn from the course? I was introduced to Theano and how to use it to build neural networks. I learned a great deal about neural networks and how to apply them to different research areas. I learned about what questions an industry finds interesting vs. what questions an academic lab finds interesting. I learned how to work with a team to develop a project from its inception to its presentation to a public audience. How might this course affect my career? I learned how my background as a cognitive scientist can be leveraged to transition into a career in data science. What was surprising or the greatest challenge? The greatest challenge was to figure out how to keep everyone involved and focused on the main research question at hand. While we each wanted to work on individual aspects of DOAP ROC that interested us the most, we had to remain a cohesive unit that worked interactively.
	Nathan Kent Nathan Kent is a second year PhD student in the Computer Science department at the University of Rochester. He is a member of the Robotics and Artificial Intelligence Laboratory under Thomas Howard. His major interest is in epigenetic robotics. He received his BS in computer engineering from Iowa State University. Website Email: nate@nkent.net OR nkent2@cs.rochester.edu What did I learn from the course? The things I found to be the most interesting were the subtle differences between doing research as a student in brain and cognitive sciences and doing research as a student in computer science. The final project was an interesting exercise in crossing that gap. How might this course affect my career? This course provided me with the opportunity to make contacts in another department that I may not have made otherwise. Hopefully, I will be able to use this experience to work with the BCS department in the future. What was surprising or the greatest challenge? The writing workshops. The difference between the workshops in the first year and the second year seemed like night and day. I did not expect them to be as useful and interesting as they were. Updates: Since the class ended, I have attended the Epirob 2017 and RSS 2018 conferences.
	Priyanka Mehta Priyanka Mehta is a first year PhD student in the Brain and Cognitive Sciences department working with Ben Hayden. She has a BS in cognitive science from the University of California – Los Angeles. She uses electrophysiological methods to study neuronal activity during decision making and is interested in the neural correlates of reward value and risky choice. Update Since Traineeship: Priyanka is finishing her PhD studies in the Hayden Lab at the University of Minnesota.
	Parker Riley Parker is a first-year computer science PhD student, working on natural language processing with Daniel Gildea. His primary research interests are in machine translation and multilinguality, with a focus on unsupervised and semi-supervised methods. He received his BSc in computer science from the University of Kansas. Resume Project Contributions: Parker primarily worked on the face detection and facial feature extraction components. What I learned: "This course was an excellent way to form interdisciplinary connections and get a broader perspective on research in the field." Career impact: "With the recent boom in deep learning, getting familiar with it early on in the program is an excellent way to be set up for success after graduation." Surprise/challenge: "I was most surprised to see how much work there is happening in other fields that can be applied to my own; we need programs like this to increase cross-disciplinary communication." Updates since participation: I attended ACL 2017, presented published work at ACL 2018 (both trips with support from the NRT), and interned at Google in New York City.
	Sudhanshu Srivastava Sudhanshu is a first year MS student in computer science. His research interest is human and computer vision. His undergrad degree was in mathematics. Website CV Email: sudhanshusri1001@gmail.com Project Contributions: Our team worked on Automated Diagnosis of Aphasic Patients using Deep Learning. I implemented multiple Convolutional Neural Networks and LSTM models as well as tuned the hyperparameters for these. What did I learn from the course: In the first semester, I learned the basics of Deep Learning and then I learned its applications to computer vision and language. I also learned the basics of psycholinguistics, Bayesian Inference and how these are used in cognitive science research, and crowdsourcing through Mechanical Turk. In the second semester, I applied what I had learned in the first semester to a real-world problem. How might this affect my career: The NRT already immensely helped my career. I had an interest in both artificial intelligence and neuroscience and I had little background in either before joining. This fall I begin a PhD in dynamical neuroscience and will be working at the intersection of AI and neuroscience. I also met Prof Kautz at the NRT and later worked in his lab. This work, along with what I learned at the NRT, helped me get another project, which led to a paper and a workshop presentation. What was surprising or the greatest challenge: I was pleasantly surprised to find that Bayesian Inference has an active application in cognitive science. Updates since the class ended: Publications: Ryan Wolf J, Srivastava S, Frank K, Pentland AP, Pentland BT. “Healthcare Routines as Action Networks”. XXXVIII Sunbelt 2018 Conference: Network Sciences, June 26-July 1, 2018, Utrecht, Netherlands. L Chen, S Srivastava, Z Duan and C Xu. “Deep Cross-Modal Audio-Visual Generation”. ACMMM Thematic Workshop, Mountain View, CA, 2017. (* denotes equal contribution.) Research Assistant Position at URMC as a part of an NSF grant. Accepted for a PhD in Dynamical Neuroscience at the University of California at Santa Barbara.
	Sharong Yan Sharong Yan is a third year graduate student. Before the University of Rochester, Yan was at the University of Iowa, where he earned his master's degree in cognitive psychology, and he did his undergrad at Peking University. Yan worked with Dr. T. Florian Jaeger and Dr. Mike Tanenhaus. In his research, his biggest interest is to study how people use contextual information to guide their language processing and how they rapidly update their expectations based on recent experience in the communication context and with the talker. Personal website Current resume Email: teaudioego@gmail.com Project contributions: I was primarily responsible for extracting prosodic information from the videos and model evaluation (analyzing model prediction errors and how that informs us what knowledge the model has). What did I learn from the course? I was exposed to different methodologies (DNNs, Bayesian models) and had hands-on experience in applying them to analyze noisy, real-world data. This is very different from what I am used to as a cognitive science researcher, where most data we acquire is collected from well-controlled experiments. How might this course affect my career? The course has provided me with new perspectives that led me to consider pursuing a career as a data scientist. The exposure to state-of-the-art methodologies gave me new ways of thinking about how people represent complex knowledge. I am excited by the opportunity of combining these data science methods with my knowledge in psycholinguistics to better understand the dynamics underlying human language processing. Conference attendance since taking the course: Oral: Yan, S., Mollica, F., and Tanenhaus, M. (2018). A context constructivist account of contextual diversity. Talk presented at the 40th Conference of the Cognitive Science Society, Madison, Wisconsin. Poster: Yan, S., and Jaeger, T. F. (2018). Comparing models of unsupervised adaptation in speech perception. Poster to be presented at AMLaP 2018, Berlin, Germany. Yan, S., Kuperberg, G. R., and Jaeger, T. F. (2017). Bayesian surprise during incremental anticipatory processing: a re-analysis of Nieuwland et al. (2017), based on DeLong et al. (2005). Poster presented at the 9th Annual Meeting of the Society for the Neurobiology of Language, Baltimore, MD.