
[Learning Competition 2021 - Hard Disk Anomaly Detection] [Summary] Viewing Basic Information About the Training Dataset with pandas

Posted by 年月日 on 2021/3/1
### Huawei Network AI Learning Competition 2021 - Hard Disk Anomaly Detection

[Huawei Network AI Learning Competition 2021 - Hard Disk Anomaly Detection](https://competition.huaweicloud.com/information/1000041370/introduction)

### Viewing basic information about the training dataset with pandas

``` python
import os

import pandas as pd
from naie.datasets import get_data_reference

# Switch to the working directory of the competition project
os.chdir("/home/ma-user/work/disk")

# Load the training data from the dataset service into a DataFrame
dr_train = get_data_reference("DatasetService", "learning_training_data", file_type='csv')
train_data = dr_train.to_pandas_dataframe()
```

``` python
# View the first 6 rows
train_data.head(n=6)
```

![naie-comp-disk-pandas-show-dataset-1.png](https://bbs-img-cbc-cn.obs.cn-north-1.myhuaweicloud.com/data/forums/attachment/forum/202102/25/124408vgbrlbmpcywhcqs1.png)

``` python
# Check the size of the training dataset as (rows, columns)
train_data.shape
# Output: (558636, 109)
```

``` python
# View the row labels of the training dataset
train_data.index.values
# Output: array([     0,      1,      2, ..., 558633, 558634, 558635])
```

``` python
# View the column labels of the training dataset
# To get a list instead, use train_data.columns.values.tolist()
train_data.columns.values
# Output:
# array(['date_g', 'serial_number', 'model', 'capacity_bytes', 'failure',
#        'smart_1_normalized', 'smart_1_raw', 'smart_2_normalized', 'smart_2_raw',
#        'smart_3_normalized', 'smart_3_raw', 'smart_4_normalized', 'smart_4_raw',
#        'smart_5_normalized', 'smart_5_raw', 'smart_7_normalized', 'smart_7_raw',
#        'smart_8_normalized', 'smart_8_raw', 'smart_9_normalized', 'smart_9_raw',
#        'smart_10_normalized', 'smart_10_raw', 'smart_11_normalized', 'smart_11_raw',
#        'smart_12_normalized', 'smart_12_raw', 'smart_13_normalized', 'smart_13_raw',
#        'smart_15_normalized', 'smart_15_raw', 'smart_22_normalized', 'smart_22_raw',
#        'smart_23_normalized', 'smart_23_raw', 'smart_24_normalized', 'smart_24_raw',
#        'smart_177_normalized', 'smart_177_raw', 'smart_179_normalized', 'smart_179_raw',
#        'smart_181_normalized', 'smart_181_raw', 'smart_182_normalized', 'smart_182_raw',
#        'smart_183_normalized', 'smart_183_raw', 'smart_184_normalized', 'smart_184_raw',
#        'smart_187_normalized', 'smart_187_raw', 'smart_188_normalized', 'smart_188_raw',
#        'smart_189_normalized', 'smart_189_raw', 'smart_190_normalized', 'smart_190_raw',
#        'smart_191_normalized', 'smart_191_raw', 'smart_192_normalized', 'smart_192_raw',
#        'smart_193_normalized', 'smart_193_raw', 'smart_194_normalized', 'smart_194_raw',
#        'smart_195_normalized', 'smart_195_raw', 'smart_196_normalized', 'smart_196_raw',
#        'smart_197_normalized', 'smart_197_raw', 'smart_198_normalized', 'smart_198_raw',
#        'smart_199_normalized', 'smart_199_raw', 'smart_200_normalized', 'smart_200_raw',
#        'smart_201_normalized', 'smart_201_raw', 'smart_220_normalized', 'smart_220_raw',
#        'smart_222_normalized', 'smart_222_raw', 'smart_223_normalized', 'smart_223_raw',
#        'smart_224_normalized', 'smart_224_raw', 'smart_225_normalized', 'smart_225_raw',
#        'smart_226_normalized', 'smart_226_raw', 'smart_235_normalized', 'smart_235_raw',
#        'smart_240_normalized', 'smart_240_raw', 'smart_241_normalized', 'smart_241_raw',
#        'smart_242_normalized', 'smart_242_raw', 'smart_250_normalized', 'smart_250_raw',
#        'smart_251_normalized', 'smart_251_raw', 'smart_252_normalized', 'smart_252_raw',
#        'smart_254_normalized', 'smart_254_raw', 'smart_255_normalized', 'smart_255_raw'],
#       dtype=object)
```

``` python
# Summary information
train_data.info()
# Output:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 558636 entries, 0 to 558635
# Columns: 109 entries, date_g to smart_255_raw
# dtypes: float64(104), int64(2), object(3)
# memory usage: 464.6+ MB
```

``` python
# Descriptive statistics (includes the values a boxplot is drawn from: min, quartiles, max)
train_data.describe()
```

![naie-comp-disk-pandas-show-dataset-2.png](https://bbs-img-cbc-cn.obs.cn-north-1.myhuaweicloud.com/data/forums/attachment/forum/202102/25/124414m2admj8zmvjwnd1o.png)

### Learning resources and references

* [[2021 Learning Competition - Hard Disk Anomaly Detection] Slides from the February 23 livestream](https://bbs.huaweicloud.com/forum/thread-108940-1-1.html)
* [[Learning Competition 2021 - Hard Disk Anomaly Detection] Sample code](https://bbs.huaweicloud.com/forum/thread-107416-1-1.html)
* https://blog.csdn.net/qq_40981268/article/details/86566068
* https://www.itdiandi.net/view/1800
* https://blog.csdn.net/qq_37975685/article/details/107953941

### Other recommended learning competitions

* [Huawei Network AI Learning Competition 2021 - KPI Anomaly Detection](https://competition.huaweicloud.com/information/1000041344/introduction)
* [Huawei Network AI Learning Competition 2021 - Log Anomaly Detection](https://competition.huaweicloud.com/information/1000041371/introduction)

### Notes

1. Thanks to the instructors for their teaching and course materials.
2. Everyone is welcome to share their competition experience ^_^
3. The competition comes with fairly rich learning materials that help beginners get started smoothly, so taking part is recommended.
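As a possible follow-up to the inspection steps in this post, the sketch below gathers a few more basic facts in one place: the dtype counts, the share of missing values per column, and the distribution of the label column. It is only an illustration under stated assumptions. The `summarize` helper is hypothetical and not part of the competition sample code, the `toy` DataFrame holds made-up values that merely borrow a few column names from the training set, and treating `failure` as the label column is an assumption based on its name.

``` python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame, label: str = "failure") -> dict:
    """Collect a few basic facts about a DataFrame in one dictionary.

    Hypothetical helper for illustration; `label` is assumed to be the target column.
    """
    return {
        "shape": df.shape,
        "columns": df.columns.tolist(),
        # Number of columns per dtype
        "dtype_counts": df.dtypes.astype(str).value_counts().to_dict(),
        # Fraction of missing values in each column
        "missing_ratio": df.isna().mean().to_dict(),
        # Number of rows per label value
        "label_counts": df[label].value_counts().to_dict(),
    }


# Synthetic stand-in for the competition data (all values are made up)
toy = pd.DataFrame(
    {
        "date_g": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"],
        "serial_number": ["A1", "B2", "A1", "B2"],
        "failure": [0, 0, 0, 1],
        "smart_5_raw": [0.0, np.nan, 0.0, 8.0],
    }
)

summary = summarize(toy)
print("shape:", summary["shape"])
print("label counts:", summary["label_counts"])
for column, ratio in summary["missing_ratio"].items():
    print(f"{column}: {ratio:.0%} missing")
```

On the real data, the same helper could be called as `summarize(train_data)`, again assuming `failure` is the label column. Checking the missing-value ratios early helps decide which `smart_*` columns are worth keeping before any modeling.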

Posted on 2021-03-01 16:32:11