\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\usepackage{fullpage}
\usepackage{graphicx}
\usepackage{hyperref}
\begin{document}
\begin{titlepage}
\newcommand{\HRule}{\rule{\linewidth}{0.5mm}} % Defines a new command for the horizontal lines, change thickness here
\center % Center everything on the page

%----------------------------------------------------------------------------------------
% HEADING SECTIONS
%----------------------------------------------------------------------------------------
\textsc{\LARGE Uppsala University}\\[1.5cm] % Name of your university/college
\textsc{\Large Master Thesis }\\[0.5cm] % Major heading such as course name
\textsc{\large Specification}\\[0.5cm] % Minor heading such as course title

%----------------------------------------------------------------------------------------
% TITLE SECTION
%----------------------------------------------------------------------------------------

\HRule \\[0.4cm]
{ \huge \bfseries Exploring Machine Learning Architectures to Help Combat COVID-19 }\\[0.4cm] % Title of your document
\HRule \\[1.5cm]

%----------------------------------------------------------------------------------------
% AUTHOR SECTION
%----------------------------------------------------------------------------------------

\begin{minipage}{0.4\textwidth}
\begin{flushleft} \large
\emph{Author:}\\
Christian Ormos % Your name
\end{flushleft}
\end{minipage}
~
\begin{minipage}{0.4\textwidth}
\begin{flushright} \large
\emph{Supervisor:} \\
Rickard Jeppsson % Supervisor's Name
\end{flushright}
\end{minipage}\\[2cm]

% If you don't want a supervisor, uncomment the two lines below and remove the section above
%\Large \emph{Author:}\\
%John \textsc{Smith}\\[3cm] % Your name

%----------------------------------------------------------------------------------------
% DATE SECTION
%----------------------------------------------------------------------------------------

{\large \today}\\[2cm] % Date, change the \today to a set date if you want to be precise

%----------------------------------------------------------------------------------------
% LOGO SECTION
%----------------------------------------------------------------------------------------
\center
\includegraphics[scale=0.4]{uppsala.jpg} % Include a department/university logo - this will require the graphicx package

%----------------------------------------------------------------------------------------


\end{titlepage}



\title{Master Thesis specification}
\author{Christian Ormos}

\pagebreak

\section{Background}
This project aims to explore, and make use of, the existing datasets related to COVID-19. Today there exist a number of initiatives from companies such as Google (COVID-19 Public Datasets) \cite{Overleaf} and calls to action from governments (CORD-19) \cite{Overleaf2,Overleaf3}, where both parties have made large quantities of data, ranging from radiography to scientific articles, freely available to the public.
\newline
\newline
Research on the intersection of COVID-19 and machine learning is, understandably, a relatively young field of study with a limited (but rapidly growing) dataset. However, there has been a lot of progress in machine learning diagnostics for more common conditions such as lung emphysema, pneumonia and atelectasis \cite{Overleaf4}, which lays the foundation for exploring the possibilities of applied machine learning to combat COVID-19.
\newline
\newline
I have been in contact with the company ``Valtech AB'', located in Stockholm, Sweden, and through them acquired a mentor and a contact person at Danderyds sjukhus.


\section{Problem Definition}
At the time of writing this specification, there are 1,812,734 reported cases worldwide and 113,675 confirmed deaths \cite{Overleaf5}.
The rate of infection is still growing exponentially, which is bringing health care systems to the brink of collapse.
\newline
\newline
According to recent research \cite{Overleaf6,Overleaf7}, the 2019 novel coronavirus does present several unique features on chest X-rays and CT scans that distinguish it from imaging of other pulmonary diseases such as pneumonia. However, the key characteristics of a COVID-19 infection have proven challenging to detect with the human eye \cite{Overleaf8}.
Prognostic prediction in triage and in the management of patient care could be improved with the help of artificial intelligence. This thesis will shed light on the possibility of reliably identifying an infected patient solely from CT scans and/or X-rays using different image classification techniques. The results from these techniques will be compared against each other to form an understanding of which techniques suit this problem domain best and why.
\newline
\newline
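As a sketch of how such a comparison could be made concrete, assume (this is not fixed by the specification) that each technique is evaluated as a binary classifier on the same held-out set of labeled images, and let $TP$, $FP$, $TN$ and $FN$ denote the resulting counts of true positives, false positives, true negatives and false negatives. Two standard measures would then be
\[
\mathrm{sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{specificity} = \frac{TN}{TN + FP}.
\]
With purely hypothetical counts $TP = 90$, $FN = 10$, $TN = 80$ and $FP = 20$, a technique would reach a sensitivity of $90/100 = 0.90$ and a specificity of $80/100 = 0.80$. Reporting both values per technique, rather than a single accuracy figure, may give a fairer comparison when infected and non-infected images are unevenly represented.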


\newpage
\section{Method}
As mentioned earlier in this specification, there exists a plethora of articles regarding COVID-19. This will, of course, demand a substantial research phase at the beginning of the project to cover both the medical and the technical parts of the study. I have access to a set of medical professionals as well as IT professionals, whom I will interview.
\newline
\newline
Questions such as ``which image classification technique would make the most sense in my case'' and ``which relevant existing image analysis algorithms are used within healthcare today'' will arise and will have to be addressed. All this information will be compiled and serve as the background of the project. It should be stated that the majority of the data available today (X-rays, CT scans) is already labeled with patient status. Some key-value pairs of interest in this study are whether the patient is infected, whether the patient survived the disease, and whether the patient has any other ongoing pulmonary disease. Since the outcome for each sample is available, it makes sense to start by exploring the possibilities of supervised learning.
\newline
\newline
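As a hedged illustration of what training on such labels could optimize, assume a binary label $y_i \in \{0, 1\}$ (infected or not) for each of $N$ training images and a model output $p_i \in (0, 1)$ interpreted as the predicted probability of infection; this notation is an assumption and not part of the specification. A common choice of loss is then the binary cross-entropy
\[
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \bigl( y_i \log p_i + (1 - y_i) \log (1 - p_i) \bigr),
\]
which is small when the model assigns a high probability to the correct label of each image. Whether this loss suits the final models remains to be determined during the project.
\newline
\newline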
The technical part of this project will mainly be written in Python, utilizing the available machine learning libraries and APIs such as, but not limited to, TensorFlow, Keras and PyTorch.
All of the mentioned technologies are backed by very large communities and are well documented. TensorFlow \cite{Overleaf9} is a highly optimized library which allows for demanding computations and will therefore be utilized in conjunction with the high-level neural networks API Keras \cite{Overleaf10} to prototype the models. PyTorch is another library which enables efficient calculations via GPU acceleration \cite{Overleaf11}.
\newline
\newline
During the course of the project I will continue to explore alternative datasets beyond the main sets previously discussed. A great source of high-quality datasets is \textit{Kaggle.com} \cite{Overleaf12}.


\section{Relevant courses}
\begin{itemize}
 \item Artificial Intelligence
 \item Linear Algebra and Geometry
 \item Probability and Statistics
 \item Numerical Methods and Simulation
 \item Large Datasets for Scientific Applications
 \item Applied Cloud Computing
 \item Software Engineering Project
\end{itemize}

\section{Limitations}
\textbf{Disclaimer:} This thesis will not claim any kind of diagnostic performance for any created model without a clinical study. This thesis will first and foremost explore the possibility of creating a diagnostic tool with the currently available datasets and technologies.
\newline
\newline
If the dataset of COVID-19 X-ray images proves insufficient, valid alternatives could be to explore different datasets and other applications of machine learning such as:

\begin{itemize}
 \item Risk identification: Who is at an elevated risk of getting infected, and are they at risk of dying from the disease?
 \item Drug development: Can we develop drugs faster and more efficiently using machine learning?
 \item Disease spread prediction: Can we predict the spread of a highly contagious disease using existing data?
\end{itemize}

Another alternative for the project would be to create a ``proof of concept'' that shows the efficiency of image analysis in health care. This would also show that there is a demand and that the area of research can be improved upon.
\newline
\newline
Another limitation could be to include only either X-rays or CT scans. This is because the ratio between available X-rays and CT scans is skewed. There exist significantly more chest X-rays, which means that if we include CT scans in this study, there may not be enough of them for our models to reliably identify them.



\section{Time Plan}
\subsection{Week 1--3, 11/5--5/6}
The start of the project will focus mainly on the overall structure of the project. The research phase begins.

\subsection{Week 4--7, 8/6--3/7}
Continued research within the field. I will conduct interviews with my contact persons to get a broader understanding of what is relevant to the project.

\subsection{Week 8--10, 6/7--31/7}
Start to design and train the first models. This will be a continuous process since one of the goals of the project is to compare different image classification techniques.

\subsection{Week 11--14, 3/8--28/8}
Draw conclusive, useful results from the models.

\subsection{Week 15--18, 31/8--25/9}
Prepare for the presentation and act as opponent on another student's presentation. Finalize the work if there is still anything to improve. Continue work on the report and, in the best case, finalize it.

\subsection{Week 19, 28/9--2/10}
Keep working on the report if it is not finished.

\subsection{Week 20, 5/10--9/10}
Correct the revised report before final submission. Hand in the report.
\newpage
\begin{thebibliography}{99}

\bibitem{Overleaf} google.com, ``About COVID-19 Public Datasets'', 2020. Available at: \url{https://console.cloud.google.com/marketplace/details/bigquery-public-datasets/covid19-public-data-program?filter=solution-type:dataset&filter=category:covid19&id=7d6cc408-53c8-4485-a187-b8cb9a5c0b56}

\bibitem{Overleaf2} semanticscholar.org, ``CORD-19, COVID-19 Open Research Dataset'', 2020. Available at: \url{https://pages.semanticscholar.org/coronavirus-research}

\bibitem{Overleaf3} Ahmad Alimadadi, Sachin Aryal, Ishan Manandhar, Patricia B. Munroe, Bina Joe, Xi Cheng, ``Artificial intelligence and machine learning to fight COVID-19'', March 27, 2020. Available at: \url{https://journals-physiology-org.ezproxy.its.uu.se/doi/pdf/10.1152/physiolgenomics.00029.2020}

\bibitem{Overleaf4} Amit Kumar Jaiswal, Prayag Tiwari, Sachin Kumar, Deepak Gupta, Ashish Khanna, Joel J.P.C. Rodrigues, ``Identifying pneumonia in chest X-rays: A deep learning approach'', June 4, 2019. Available at: \url{https://www-sciencedirect-com.ezproxy.its.uu.se/science/article/pii/S0263224119305202}

\bibitem{Overleaf5} who.int, ``Rolling updates on coronavirus disease (COVID-19)'', 2020. Available at: \url{https://www.who.int/emergencies/diseases/novel-coronavirus-2019}

\bibitem{Overleaf6} Yicheng Fang, Huangqi Zhang, Jicheng Xie, Minjie Lin, Lingjun Ying, Peipei Pang, Wenbin Ji, ``Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR'', February 19, 2020. Available at: \url{https://pubs.rsna.org/doi/10.1148/radiol.2020200432}

\bibitem{Overleaf7} Tao Ai, Zhenlu Yang, Hongyan Hou, Chenao Zhan, Chong Chen, Wenzhi Lv, Qian Tao, Ziyong Sun, Liming Xia, ``Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases'', February 26, 2020. Available at: \url{https://pubs.rsna.org/doi/10.1148/radiol.2020200642}

\bibitem{Overleaf8} Ming-Yen Ng, Elaine YP Lee, Jin Yang, Fangfang Yang, Xia Li, Hongxia Wang, Macy Mei-sze Lui, Christine Shing-Yen Lo, Barry Leung, Pek-Lan Khong, Christopher Kim-Ming Hui, Kwok-yung Yuen, Michael David Kuo, ``Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review'', February 13, 2020. Available at: \url{https://pubs.rsna.org/doi/10.1148/ryct.2020200034}

\bibitem{Overleaf9} tensorflow.org. Available at: \url{https://www.tensorflow.org/}

\bibitem{Overleaf10} keras.io. Available at: \url{https://keras.io/}

\bibitem{Overleaf11} pytorch.org. Available at: \url{https://pytorch.org/}

\bibitem{Overleaf12} kaggle.com. Available at: \url{https://www.kaggle.com/}

\end{thebibliography}



\end{document}