covid-19 infection model

Y-h. Taguchi, a professor at Chuo University, looks toward a COVID-19 infection model that uses blood from human patients infected with COVID-19

Since 2020, the COVID-19 pandemic has affected almost all countries. Although vaccines currently seem to be effective in lowering the COVID-19 mortality rate, the virus continues to mutate, and the likelihood of another lockdown being introduced continues to rise. To avoid such situations, we need effective drugs, which have not yet been developed, and an effective COVID-19 infection model.

In our previous articles published in Open Access Government [1,2], we introduced our recent efforts to develop effective drugs against COVID-19 using computers.

However, the studies described in those articles could make use of only human and mouse cell lines. If we can use measurements taken directly from human patients infected with COVID-19, we might obtain better results.

Using human cell lines to understand COVID-19

Recently, the research groups headed by Assistant Prof Miyata of Ryukyu University and Prof Ikematsu of the National Institute of Technology, Okinawa College, in collaboration with us, applied our methods to analyse gene expression in blood collected from human COVID-19 patients [3].

This study has both advantages and disadvantages compared with the studies described in the previous articles [1,2]. Since the samples come from human patients, the measurement is more direct than those that used cell lines.

However, since the samples are taken not from the lung, where infection takes place, but from the blood, the measurement is indirect in this sense. Thus, it is unclear whether replacing human lung cell lines with human blood improves the results. The only way to settle this point is a practical trial.

Practical trial of gene datasets

The research team downloaded two publicly available datasets and applied our method, which they named PCAUFE (principal component analysis-based unsupervised feature extraction [3]), to them.
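To give a feel for what a PCA-based, label-free gene selection can look like, here is a minimal sketch in plain Python. It is not the code used in [3]: the data are synthetic, the function names are hypothetical, only the first principal component is examined, and the 0.01 threshold on adjusted P-values is an assumed setting. The sketch assumes that gene scores on the chosen component are roughly Gaussian, so genes with outlying scores can be picked out with a chi-squared tail probability followed by a Benjamini-Hochberg correction.

```python
import math
import random


def standardize(values):
    """Shift to zero mean and scale to unit variance."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def first_pc_loadings(samples, n_iter=200):
    """Leading eigenvector of the sample-by-sample Gram matrix (power iteration)."""
    m = len(samples)
    gram = [[sum(a * b for a, b in zip(samples[j], samples[k])) for k in range(m)]
            for j in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(n_iter):
        w = [sum(gram[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(expression, threshold=0.01):
    """expression[j][i] is gene i in sample j. No class labels are used."""
    samples = [standardize(sample) for sample in expression]
    loadings = first_pc_loadings(samples)
    n_genes = len(samples[0])
    # Score of each gene on the first principal component.
    scores = [sum(loadings[j] * samples[j][i] for j in range(len(samples)))
              for i in range(n_genes)]
    z = standardize(scores)
    # Tail probability of a chi-squared variable (1 d.o.f.) exceeding z**2.
    p = [math.erfc(abs(x) / math.sqrt(2)) for x in z]
    adjusted = benjamini_hochberg(p)
    selected = [i for i in range(n_genes) if adjusted[i] < threshold]
    return selected, loadings


# Synthetic example (assumption): 200 genes, 4 controls, 4 patients,
# with five genes strongly downregulated in patients.
random.seed(0)
N_GENES, SHIFTED = 200, {3, 50, 77, 120, 199}
controls = [[random.gauss(0, 1) for _ in range(N_GENES)] for _ in range(4)]
patients = [[random.gauss(-10 if i in SHIFTED else 0, 1) for i in range(N_GENES)]
            for _ in range(4)]
selected, loadings = select_genes(controls + patients)
print(selected)
```

In this toy setting the first component happens to separate patients from controls. With real data one would inspect several components and keep the one whose sample loadings distinguish the two groups.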

They found that as few as 123 genes are differentially expressed between healthy controls and COVID-19 patients in the first dataset. Since the total number of human genes is about 20,000, these 123 genes make up only a very small fraction of them, roughly 0.6%.

To confirm whether such a seemingly small number of genes can differentiate COVID-19 patients from healthy controls, the research group constructed three machine learning models to classify the two groups, patients and healthy controls, using only the selected 123 genes. The three models were tested on the second public dataset, which is independent of the first.
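The train-on-one-dataset, test-on-the-other protocol can be sketched as follows. The article does not say which three models were used, so a nearest-centroid classifier stands in here purely for illustration; the tiny datasets, the gene indices and all function names are made up.

```python
def restrict(sample, gene_idx):
    """Keep only the selected genes of one sample."""
    return [sample[i] for i in gene_idx]


def train(samples, labels, gene_idx):
    """Return one centroid per class, computed on the selected genes only."""
    model = {}
    for cls in set(labels):
        rows = [restrict(s, gene_idx) for s, y in zip(samples, labels) if y == cls]
        model[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict(model, sample, gene_idx):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    x = restrict(sample, gene_idx)
    return min(model, key=lambda cls: sum((a - b) ** 2 for a, b in zip(x, model[cls])))


# Hypothetical data: 4 genes per sample, of which genes 0 and 2 were "selected".
selected_genes = [0, 2]
labels = ["control", "control", "patient", "patient"]
dataset_a = [[5, 1, 5, 9], [6, 9, 5, 1], [1, 1, 2, 9], [2, 9, 1, 1]]
dataset_b = [[5.5, 3, 6, 4], [4.5, 2, 5, 8], [1.5, 7, 1, 2], [2.5, 0, 2, 3]]

# Train on the first dataset and test on the independent second one.
model_a = train(dataset_a, labels, selected_genes)
predicted_b = [predict(model_a, s, selected_genes) for s in dataset_b]

# Exchange the roles of the two datasets.
model_b = train(dataset_b, labels, selected_genes)
predicted_a = [predict(model_b, s, selected_genes) for s in dataset_a]
print(predicted_b, predicted_a)
```

The point of the design is that the test samples never influence the model, and that the unselected genes (indices 1 and 3 here) are ignored entirely.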

To evaluate classification performance, the research group used the AUC (area under the receiver operating characteristic curve), which is 1.0 for perfect performance and 0.5 for random selection. The three models trained on the 123 genes achieved AUCs above 0.9, which indicates excellent performance. When the same procedure is repeated with the two datasets exchanged, i.e., the models are trained on the second dataset and tested on the first, they achieve similar performance. This means that the results are robust. Thus, despite their very small number, the selected genes can successfully discriminate COVID-19 patients from healthy controls.
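For readers unfamiliar with the AUC, the following small sketch computes it from its pairwise definition: the fraction of (patient, control) pairs in which the patient receives the higher score, with ties counted as one half. The function name and the example scores are illustrative only.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (1 = patient, 0 = control)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
perfect = auc(labels, [0.1, 0.2, 0.8, 0.9])        # every patient outranks every control
uninformative = auc(labels, [0.5, 0.5, 0.5, 0.5])  # scores carry no information
partial = auc(labels, [0.1, 0.8, 0.2, 0.9])        # one of four pairs is misordered
print(perfect, uninformative, partial)
```

A perfectly separating score gives 1.0, a constant score gives 0.5, and the partially informative example gives 0.75.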

In addition, to confirm the superiority of PCAUFE, the research group also used other state-of-the-art methods to select genes that are differentially expressed between COVID-19 patients and healthy controls. Classification performance using genes selected by these methods is comparable with that of PCAUFE when only the same number of top-ranked genes as PCAUFE selected is used. However, the number of probes these methods select ranges from several thousand to eighteen thousand. Thus, the state-of-the-art methods are inferior at restricting the number of genes used for classification.

Enriching 123 genes

Next, the research group investigated which functions are enriched in the selected 123 genes. They found that many immune-related genes among these 123 genes are downregulated in COVID-19 patients’ blood. In addition, many of the biological pathways and transcription factors enriched in these genes have previously been reported to be suppressed in COVID-19 patients.
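The idea behind such an enrichment analysis can be illustrated with a one-sided hypergeometric test, a common choice for this purpose. The article does not state which enrichment tools were used, so this is only a stand-in, and the annotation counts below (1,000 immune-related genes, 30 of them among the 123) are invented for the example.

```python
from math import comb


def enrichment_p_value(n_total, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn at random from n_total,
    of which n_annotated carry the annotation of interest."""
    upper = min(n_annotated, n_selected)
    tail = sum(comb(n_annotated, k) * comb(n_total - n_annotated, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / comb(n_total, n_selected)


# Hypothetical numbers: 20,000 genes in total, 123 selected,
# 1,000 genes annotated as immune-related, 30 of them among the 123.
expected_overlap = 123 * 1000 / 20000
p_value = enrichment_p_value(20000, 1000, 123, 30)
print(expected_overlap, p_value)
```

With these made-up counts, only about six immune-related genes would be expected among the 123 by chance, so an overlap of 30 yields a very small P-value.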

These findings suggest that PCAUFE can not only identify genes whose expression discriminates between COVID-19 patients and healthy controls (i.e., biomarkers), but also identify a restricted number of probable disease-causing genes.

The finding that patients’ blood samples can be used to investigate COVID-19 and to create a COVID-19 infection model is remarkable.

First, if blood rather than lung tissue can be an effective tissue to investigate, samples are much easier to collect. Collecting a massive number of lung samples from COVID-19 patients is impractical, but collecting blood samples is feasible. Second, since blood samples can be used for diagnosis, it is easy to monitor disease progression, which would allow us to find the right time to treat with drugs once such drugs are identified.

Unfortunately, the research team has not yet started to identify candidate drug compounds using the 123 identified genes. This work will be performed soon, and the team hopes it will yield promising candidate drug compounds.



[1] Y-h. Taguchi, How to compete with COVID-19 with a computer? Open Access Government, issue 33, Jan. (2022) pp. 210-211.

[2] Y-h. Taguchi, Can mice be an effective model animal for Covid-19? Open Access Government, issue 34, April (2022) pp. 112-113.

[3] Fujisawa, K., Shimo, M., Taguchi, YH. et al. PCA-based unsupervised feature extraction for gene expression analysis of COVID-19 patients. Sci Rep 11, 17351 (2021).


Please note: This is a commercial profile

© 2019. This work is licensed under CC-BY-NC-ND.
