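### Huawei Network AI Learning Competition 2021: Disk Anomaly Detection
[Huawei Network AI Learning Competition 2021: Disk Anomaly Detection](https://competition.huaweicloud.com/information/1000041370/introduction)
### Inspecting basic information about the training dataset with pandas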
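``` python
import os

import pandas as pd
from naie.datasets import get_data_reference

# Switch to the working directory used for this competition
os.chdir("/home/ma-user/work/disk")

# Get a reference to the training dataset and load it as a pandas DataFrame
dr_train = get_data_reference("DatasetService", "learning_training_data", file_type='csv')
train_data = dr_train.to_pandas_dataframe()
```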
``` python
# Show the first 6 rows
train_data.head(n=6)
```

``` python
# Check the size of the training dataset as (rows, columns)
train_data.shape
(558636, 109)
```
``` python
# Show the row labels (index) of the training dataset
train_data.index.values
array([ 0, 1, 2, ..., 558633, 558634, 558635])
```
``` python
# Show the column labels of the training dataset
# To get a plain Python list instead, use train_data.columns.values.tolist()
train_data.columns.values
array(['date_g', 'serial_number', 'model', 'capacity_bytes', 'failure',
'smart_1_normalized', 'smart_1_raw', 'smart_2_normalized',
'smart_2_raw', 'smart_3_normalized', 'smart_3_raw',
'smart_4_normalized', 'smart_4_raw', 'smart_5_normalized',
'smart_5_raw', 'smart_7_normalized', 'smart_7_raw',
'smart_8_normalized', 'smart_8_raw', 'smart_9_normalized',
'smart_9_raw', 'smart_10_normalized', 'smart_10_raw',
'smart_11_normalized', 'smart_11_raw', 'smart_12_normalized',
'smart_12_raw', 'smart_13_normalized', 'smart_13_raw',
'smart_15_normalized', 'smart_15_raw', 'smart_22_normalized',
'smart_22_raw', 'smart_23_normalized', 'smart_23_raw',
'smart_24_normalized', 'smart_24_raw', 'smart_177_normalized',
'smart_177_raw', 'smart_179_normalized', 'smart_179_raw',
'smart_181_normalized', 'smart_181_raw', 'smart_182_normalized',
'smart_182_raw', 'smart_183_normalized', 'smart_183_raw',
'smart_184_normalized', 'smart_184_raw', 'smart_187_normalized',
'smart_187_raw', 'smart_188_normalized', 'smart_188_raw',
'smart_189_normalized', 'smart_189_raw', 'smart_190_normalized',
'smart_190_raw', 'smart_191_normalized', 'smart_191_raw',
'smart_192_normalized', 'smart_192_raw', 'smart_193_normalized',
'smart_193_raw', 'smart_194_normalized', 'smart_194_raw',
'smart_195_normalized', 'smart_195_raw', 'smart_196_normalized',
'smart_196_raw', 'smart_197_normalized', 'smart_197_raw',
'smart_198_normalized', 'smart_198_raw', 'smart_199_normalized',
'smart_199_raw', 'smart_200_normalized', 'smart_200_raw',
'smart_201_normalized', 'smart_201_raw', 'smart_220_normalized',
'smart_220_raw', 'smart_222_normalized', 'smart_222_raw',
'smart_223_normalized', 'smart_223_raw', 'smart_224_normalized',
'smart_224_raw', 'smart_225_normalized', 'smart_225_raw',
'smart_226_normalized', 'smart_226_raw', 'smart_235_normalized',
'smart_235_raw', 'smart_240_normalized', 'smart_240_raw',
'smart_241_normalized', 'smart_241_raw', 'smart_242_normalized',
'smart_242_raw', 'smart_250_normalized', 'smart_250_raw',
'smart_251_normalized', 'smart_251_raw', 'smart_252_normalized',
'smart_252_raw', 'smart_254_normalized', 'smart_254_raw',
'smart_255_normalized', 'smart_255_raw'], dtype=object)
```
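The column listing above follows a regular pattern: a few metadata columns (`date_g`, `serial_number`, `model`, `capacity_bytes`, `failure`) followed by `smart_<id>_normalized` / `smart_<id>_raw` pairs. A minimal sketch of splitting the columns by that pattern is shown below. It runs on a small made-up frame (`toy`, with invented values) rather than the real `train_data`, so it only illustrates the idea.

``` python
import pandas as pd

# Toy frame reusing a few column names from the listing above.
# All values here are made up for illustration.
toy = pd.DataFrame({
    "date_g": ["2021-01-01", "2021-01-02"],
    "serial_number": ["disk_a", "disk_a"],
    "failure": [0, 1],
    "smart_5_normalized": [100.0, 98.0],
    "smart_5_raw": [0.0, 8.0],
    "smart_9_normalized": [95.0, 95.0],
    "smart_9_raw": [4021.0, 4045.0],
})

# Split the columns by prefix and suffix.
raw_cols = [c for c in toy.columns if c.startswith("smart_") and c.endswith("_raw")]
norm_cols = [c for c in toy.columns if c.startswith("smart_") and c.endswith("_normalized")]
meta_cols = [c for c in toy.columns if not c.startswith("smart_")]

print(raw_cols)
print(norm_cols)
print(meta_cols)
```

On the real data, the same list comprehensions could be applied to `train_data.columns` to pick out only the raw or only the normalized SMART attributes.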
``` python
# Summary information: index range, column count, dtypes, and memory usage
train_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 558636 entries, 0 to 558635
Columns: 109 entries, date_g to smart_255_raw
dtypes: float64(104), int64(2), object(3)
memory usage: 464.6+ MB
```
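`info()` reports dtypes and memory, but on a frame this wide it does not show how many values are missing per column or how the `failure` label is distributed. The sketch below shows one way to check both. It uses a tiny made-up frame (`toy`) instead of the real `train_data`, and the choice of these particular checks is an assumption about what is useful here, not part of the original notes.

``` python
import numpy as np
import pandas as pd

# Toy frame with invented values; column names are taken from the listing above.
toy = pd.DataFrame({
    "failure": [0, 0, 0, 1],
    "smart_1_raw": [1.0, 2.0, np.nan, 4.0],
    "smart_255_raw": [np.nan, np.nan, np.nan, np.nan],
})

# Fraction of missing values per column, highest first.
missing_ratio = toy.isna().mean().sort_values(ascending=False)
print(missing_ratio)

# Columns that are entirely empty carry no information.
empty_cols = missing_ratio[missing_ratio == 1.0].index.tolist()
print(empty_cols)

# Number of rows per label value.
label_counts = toy["failure"].value_counts()
print(label_counts)

# Memory per column in bytes; deep=True also counts the contents of object columns.
print(toy.memory_usage(deep=True))
```

The `+` in `464.6+ MB` above means the figure is a lower bound, because the object columns are not measured in depth by default; `train_data.info(memory_usage="deep")` gives the full figure at the cost of a slower call.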
``` python
# Descriptive statistics (includes the values a boxplot is drawn from: min, quartiles, median, max)
train_data.describe()
```
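To make the link between `describe()` and a boxplot concrete, the sketch below derives the box and whisker limits from the `describe()` output of a single column. The data is made up, the series name `smart_9_raw` is borrowed from the column listing only for illustration, and the 1.5 × IQR fence is the common boxplot convention rather than something specified in these notes.

``` python
import pandas as pd

# Made-up values with one obvious outlier.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0], name="smart_9_raw")

stats = values.describe()
q1, median, q3 = stats["25%"], stats["50%"], stats["75%"]

# Interquartile range and the conventional 1.5 * IQR fences.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are the ones a boxplot draws as individual outliers.
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(stats)
print(lower_fence, upper_fence)
print(outliers.tolist())
```

The same calculation could be applied to any numeric column of `train_data`, for example to flag suspicious SMART readings before modelling.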

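### Learning resources and references
* [2021 Learning Competition, Disk Anomaly Detection: slides from the February 23 livestream](https://bbs.huaweicloud.com/forum/thread-108940-1-1.html)
* [2021 Learning Competition, Disk Anomaly Detection: sample code](https://bbs.huaweicloud.com/forum/thread-107416-1-1.html)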
* https://blog.csdn.net/qq_40981268/article/details/86566068
* https://www.itdiandi.net/view/1800
* https://blog.csdn.net/qq_37975685/article/details/107953941
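### Other recommended learning competitions
* [Huawei Network AI Learning Competition 2021: KPI Anomaly Detection](https://competition.huaweicloud.com/information/1000041344/introduction)
* [Huawei Network AI Learning Competition 2021: Log Anomaly Detection](https://competition.huaweicloud.com/information/1000041371/introduction)
### Notes
1. Thanks to the instructors for their teaching and course materials.
2. Everyone is welcome to share and discuss their competition experience ^_^
3. The competition comes with fairly rich learning materials that help beginners get started smoothly, so taking part is recommended.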
