Machine Learning and Statistical Analysis on Massive Biomedical Data
While recent advancements in data acquisition technologies have enabled us to collect and store data of unprecedented size and diversity, they have also introduced new computational and statistical challenges, such as algorithm scalability, multiple comparison problems, and data contamination.
To analyze such data and make informed decisions from it, it is imperative to design specialized statistical tools and software that are scalable, robust, and capable of extracting meaningful information in the presence of data corruption.
My research specialty lies in developing machine learning and statistical tools for analyzing large-scale neuroimaging data (e.g., diffusion and functional MRI, CT) with applications to brain disorders such as schizophrenia, autism, and ADHD. As many traditional statistical methods break down in such settings, the goal of my research is to: (1) develop specialized machine learning methods that leverage potential structure in the data, which can improve predictive power and model interpretability, and (2) design and implement scalable numerical optimization algorithms to address data dimensionality.
PhD Dissertation (University of Michigan, Ann Arbor)
- "Scalable Machine Learning Methods for Massive Biomedical Data Analysis" [pdf]
Presentation on Machine Learning (Japanese・日本語)
I recently had the pleasure of giving a talk on machine learning to the Philadelphia Japanese Research Community at the University of Pennsylvania: 「マシーン・ラーニングってつまり何？」 ("So What Exactly Is Machine Learning?") [pdf, pptx]
Research Projects and Portfolio
Below are some of the research projects I have worked on during my academic career.
Discriminative Subnetwork Extraction using Supervised Non-negative Matrix Factorization
Disease Classification using resting-state-fMRI based Connectomes
Multisite Disease Classification using Multitask Structured Sparse SVM
Uncertainty Analysis for Biomedical Image Registration
Analysis of Gender Differences in Structural Connectivity
- T. Watanabe, B. Tunc, D. Parker, J. Kim, R. Verma, "Label-Informed Non-negative Matrix Factorization with Manifold Regularization for Discriminative Subnetwork Detection," Medical Image Computing & Computer Assisted Intervention, 2016.
- T. Watanabe, D. Kessler, C. Scott, C. Sripada, "Multisite Disease Classification with Functional Connectomes via Multitask Structured Sparse SVM," Sparsity Techniques in Medical Imaging (STMI), Boston, USA, 2014. [Author preprint (.pdf)]
- T. Watanabe, D. Kessler, C. Scott, M. Angstadt, C. Sripada, "Disease Prediction based on Functional Connectomes using a Scalable and Spatially-Informed Support Vector Machine," NeuroImage, vol. 96, no. 4, pp. 183-202, 2014. (code) [arXiv]
- T. Watanabe, C. Scott, D. Kessler, M. Angstadt, C. Sripada, "Scalable Fused Lasso SVM for Connectome-based Disease Classification," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, 2014.
- C. Sripada, D. Kessler, T. Watanabe, R. Welsh, Y. Fang, M. Angstadt, S. Taylor, C. Scott, "Whole-brain connectomic analysis of 145 resting state scans reveals network neurosignatures of schizophrenia," Biological Psychiatry, vol. 73, 2013.
- T. Watanabe and C. Scott, "Spatial Confidence Regions for Quantifying and Visualizing Registration Uncertainty," Biomedical Image Registration, vol. 7359, pp. 120-130, 2012. [Author preprint (.pdf)]
- Medical Image Computing & Computer Assisted Intervention (MICCAI), Athens, Greece (October 2016).
- International Workshop on Sparsity Techniques in Medical Imaging, MIT, Cambridge, MA, September 14, 2014.
- Workshop on Sensing and Analysis of High-Dimensional Data, Duke University, Durham, NC, July 23-25, 2013.
- Workshop on Biomedical Image Registration, Vanderbilt University, Nashville, TN, July 7-8, 2012.
- Michigan Student Symposium for Interdisciplinary Statistical Sciences, University of Michigan, Ann Arbor, MI, April 6, 2012.
Research Notes
Some snapshots of my research notes, to give an idea of the way I think and of the typical ideation process in my research.
Other old notes:
- brainstorming... [pdf]
- Notes prepared for Machine Learning Reading Group [pdf1, pdf2, pdf3, pdf4]
- (more to come...)
Personal Coding Notes
- Data science notebook (link) [Github source]
- Snippets notebook (link) [Github source]
- Coding notebook (link) [Github source]
- Computer configuration notebook (link) [Github source]
- Personal PySpark API (link) — I created this since I found the official API hard to navigate
- Interactive Charts with Plotly — some examples: [Plotly-demo1] [Plotly-demo2] [Plotly-demo3]
- A more detailed version is available at takwatanabe.me/data_science