Hands-on R: A Decade of Nature Neuroscience (Part 1)

2020-06-17 00:00:00 Tags: authors, articles, statistics, countries, publications

Original work by hxj7

Preface

I have been learning R for about half a year and wanted a small project to practice on. This post is the result.

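Purpose

A first exploratory visualization of the research articles published in Nature Neuroscience from 2009 to 2018.

The work is divided into the following parts:

1. Basic statistics and visualization

2. Further statistics and visualization

3. Comparison of the major countries and visualization

4. Hot-word statistics and visualization

5. Statistical modeling of, and feature selection for, the time to acceptance

(At the time of posting only Parts 1 and 2 are complete; the rest will have to wait for another opportunity.)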

Data source

Articles were retrieved by searching the Nature website with the following parameters:

journal: neuro

subject: biological-sciences/health-sciences

article_type: research, review, protocol (excluding Introduction and Editorial)

time_range: 2009-2018

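Notes

1. Hong Kong, Macau, and Taiwan are counted together with mainland China.

2. Unless stated otherwise, dates refer to the publication date (Publish Date).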
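Note 1 could be implemented roughly as follows. This is only a sketch: the helper name `merge_regions` is hypothetical, the spellings of the region names in the scraped data are an assumption, and the `;` and `|` separators are taken from the `country` column shown later in the post.

```r
# Hypothetical helper: relabel Hong Kong, Macau and Taiwan as "China"
# inside a delimited country string such as "Hong Kong;China|Taiwan|USA".
# The region spellings below are assumptions about the raw data.
merge_regions <- function(country) {
  regions <- c("Hong Kong", "Macau", "Macao", "Taiwan")
  # Match a region only when it is a whole field, i.e. bounded by the
  # start/end of the string or by one of the separators ";" and "|".
  pattern <- paste0("(^|[;|])(", paste(regions, collapse = "|"), ")(?=$|[;|])")
  gsub(pattern, "\\1China", country, perl = TRUE)
}

merged <- merge_regions("Hong Kong;China|Taiwan|USA")
print(merged)
```

Matching whole fields rather than substrings avoids accidentally rewriting a longer name that merely contains one of the region names.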

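Limitations

1. Missing or inconsistent data biases the analysis, for example country, province, or city names that are spelled inconsistently or are missing altogether.

2. The handling of NA values is still incomplete. When to drop NAs needs careful thought.

3. Some authors share the same English name, and no disambiguation was done when counting.

4. Authors with the same number of articles are ranked in lexicographic order of their names.

5. Standard deviations were not computed.

6. Some details of the R plots still need polishing.

7. The code has been simplified but is still somewhat redundant.

8. Many other interesting questions were left unexplored for lack of time, energy, and space.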

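Disclaimer

This post is only the result of a personal exercise. It certainly contains mistakes and has no reference value, so do not let any of the flashy claims mislead you!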

Part 1: Basic statistics and visualization

Importing the data
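The import code itself did not survive in this copy of the post, so the following is only a minimal sketch of how such a table might be loaded. It assumes the scraped records were saved as a UTF-8 CSV file; the file here is a two-row stand-in written to a temporary path, with a handful of column names and values taken from the data listing later in the post.

```r
# Write a two-row stand-in file so the sketch runs on its own.
# The real file name and storage format are not given in the post.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "date,type,volume,startPage,endPage",
  "2018-12-31,Research,,1,11",
  "2018-12-17,Research,22,65,77"
), csv_path)

# Keep text columns as character rather than factor; blank cells in
# numeric columns are read as NA by read.csv.
nn <- read.csv(csv_path, stringsAsFactors = FALSE, fileEncoding = "UTF-8")

print(dim(nn))
```

Reading text as character rather than factor matches the `chr` columns in the listing and makes the later string splitting straightforward.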

Preprocessing the data
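The preprocessing code is also missing from this copy. One plausible step, sketched below under stated assumptions, is converting the date strings to `Date` values and treating empty strings as missing. The derived columns `year` and `daysToAccept` are hypothetical names introduced here for illustration; the sample values come from the first two records in the data listing.

```r
# Small stand-in data frame using values from the first two records.
nn <- data.frame(
  date        = c("2018-12-31", "2018-12-17"),
  receiveDate = c("2018-04-01", "2018-09-10"),
  reviseDate  = c("", ""),
  acceptDate  = c("2018-11-21", "2018-11-14"),
  stringsAsFactors = FALSE
)

date_cols <- c("date", "receiveDate", "reviseDate", "acceptDate")
for (col in date_cols) {
  x <- nn[[col]]
  x[x %in% ""] <- NA                       # empty string means "not recorded"
  nn[[col]] <- as.Date(x, format = "%Y-%m-%d")
}

# Hypothetical derived columns: publication year and days from
# receipt to acceptance.
nn$year <- as.integer(format(nn$date, "%Y"))
nn$daysToAccept <- as.integer(nn$acceptDate - nn$receiveDate)

print(nn[, c("date", "year", "daysToAccept")])
```

Converting early keeps later grouping by year and any date arithmetic free of string handling, and makes the missing revision dates explicit as NA.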

Inspecting the data

'data.frame': 2575 obs. of 21 variables:
 $ date : chr "2018-12-31" "2018-12-17" "2018-12-17" "2018-12-17" ...
 $ title : chr "Panoptic imaging of transparent mice reveals whole-body neuronal projections and skull–meninges connections" "TDP-43 extracted from frontotemporal lobar degeneration subject brains displays distinct aggregate assemblies a"| __truncated__ "Efficient coding of subjective value" "Invasion of white matter tracts by glioma stem cells is regulated by a NOTCH1–SOX2 positive-feedback loop" ...
 $ type : chr "Research" "Research" "Research" "Research" ...
 $ magzine : chr "Nature Neuroscience" "Nature Neuroscience" "Nature Neuroscience" "Nature Neuroscience" ...
 $ volume : int NA 22 22 22 22 22 22 22 22 22 ...
 $ startPage : int 1 65 134 91 120 57 78 37 106 25 ...
 $ endPage : int 11 77 142 105 133 64 90 46 119 36 ...
 $ abstract : chr "Analysis of entire transparent rodent bodies after clearing could provide holistic biological information in he"| __truncated__ "Accumulation of abnormally phosphorylated TDP-43 (pTDP-43) is the main pathology in affected neurons of people "| __truncated__ "Preference-based decisions are essential for survival, for instance, when deciding what we should (not) eat. De"| __truncated__ "Early invasive growth along specific anatomical structures, especially the white matter tract, is regarded as o"| __truncated__ ...
 $ receiveDate: chr "2018-04-01" "2018-09-10" "2018-01-20" "2018-04-06" ...
 $ reviseDate : chr "" "" "" "" ...
 $ acceptDate : chr "2018-11-21" "2018-11-14" "2018-11-13" "2018-10-31" ...
 $ author : chr "Ruiyao Cai|Chenchen Pan|Alireza Ghasemigharagoz|Mihail Ivilinov Todorov|Benjamin Förstera|Shan Zhao|Hars"| __truncated__ "Florent Laferrière|Zuzanna Maniecka|Manuela Pérez-Berlanga|Marian Hruska-Plochan|Larissa Gilhespy|Eva-Maria Hoc"| __truncated__ "Rafael Polanía|Michael Woodford|Christian C. Ruff" "Jun Wang|Sen-Lin Xu|Jiang-Jie Duan|Liang Yi|Yu-Feng Guo|Yu Shi|Lin Li|Ze-Yu Yang|Xue-Mei Liao|Jiao Cai|Yan-Qi Z"| __truncated__ ...
 $ nauthor : int 22 23 3 22 4 10 18 17 20 15 ...
 $ ncoauthor : int 2 2 1 4 1 1 3 1 2 1 ...
 $ corresp : chr "Ali Ertürk" "Magdalini Polymenidou" "Rafael Polanía|Christian C. Ruff" "Xiu-Wu Bian|Shi-Cang Yu" ...
 $ ncorresp : int 1 1 2 2 1 2 2 1 3 2 ...
 $ institute : chr "Ludwig-Maximilians University Munich;Graduate School of Systemic Neurosciences Munich|Ludwig-Maximilians Univer"| __truncated__ "University of Zurich|University of Zurich|University of Zurich|University of Zurich|University of Zurich|Univer"| __truncated__ "University of Zurich;ETH Zurich;Columbia University|Columbia University|University of Zurich" "Army Medical University (Third Military Medical University);Army Medical University (Third Military Medical Uni"| __truncated__ ...
 $ city : chr "Munich;Munich|Munich;Munich|Munich|Munich;Munich|Munich|Munich|Munich|Munich|Munich|Munich;Munich|Munich|Copenh"| __truncated__ "Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|London;London|London;London|London;London"| __truncated__ "Zurich;Zurich;New York|New York|Zurich" "Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|C"| __truncated__ ...
 $ province : chr "Munich;Munich|Munich;Munich|Munich|Munich;Munich|Munich|Munich|Munich|Munich|Munich|Munich;Munich|Munich|Copenh"| __truncated__ "Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|Zurich|London;London|London;London|London;London"| __truncated__ "Zurich;Zurich;NY|NY|Zurich" "Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|Chongqing|Chongqing;Chongqing|C"| __truncated__ ...
 $ country : chr "Germany;Germany|Germany;Germany|Germany|Germany;Germany|Germany|Germany|Germany|Germany|Germany|Germany;Germany"| __truncated__ "Switzerland|Switzerland|Switzerland|Switzerland|Switzerland|Switzerland|Switzerland|Switzerland|Switzerland|Swi"| __truncated__ "Switzerland;Switzerland;USA|USA|Switzerland" "China;China|China|China;China|China|China;China|China|China;China|China;China|China;China|China;China|China|Chi"| __truncated__ ...
 $ address : chr "Institute for Stroke and Dementia Research, Klinikum der Universität München, Ludwig-Maximilians Univers"| __truncated__ "Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland|Institute of Molecular Life Sci"| __truncated__ "Zurich Center for Neuroeconomics (ZNE), Department of Economics, University of Zurich, Zurich, Switzerland;Deci"| __truncated__ "Institute of Pathology and Southwest Cancer Center, Key Laboratory of the Ministry of Education, Southwest Hosp"| __truncated__ ...
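The listing above suggests that the multi-valued columns use `|` between authors and `;` between one author's several affiliations. That convention is inferred from the sample values, not stated by the post. Assuming it holds, a field could be unpacked as in this sketch, which uses the one untruncated record from the listing; the variable names are illustrative only.

```r
# One record copied from the listing above (the third article).
author  <- "Rafael Polanía|Michael Woodford|Christian C. Ruff"
country <- "Switzerland;Switzerland;USA|USA|Switzerland"

# Assumed convention: "|" separates authors, ";" separates the
# affiliations of a single author.
authors <- strsplit(author, "|", fixed = TRUE)[[1]]
countries_by_author <- strsplit(
  strsplit(country, "|", fixed = TRUE)[[1]],
  ";", fixed = TRUE
)
names(countries_by_author) <- authors

# Number of authors, and the distinct countries behind the article.
n_author <- length(authors)
article_countries <- unique(unlist(countries_by_author, use.names = FALSE))

print(n_author)
print(article_countries)
```

Using `fixed = TRUE` matters here because `|` is a regular-expression metacharacter; without it `strsplit` would split between every character.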
