刘吉祥 posted on 2024-4-11 19:44:48

Finding the row-wise maximum and its index with pandas

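After training a model, you often need to post-process its predictions, for example taking the row-wise maximum of the predicted class probabilities and storing the column where that maximum occurs in a separate column.
The data looks like this: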
array
array([[ 0.47288769,0.23982215,0.2261405 ,0.06114962],
       [ 0.67969596,0.11435176,0.17647322,0.02947907],
       [ 0.00621393,0.01652142,0.31117165,0.66609299],
       [ 0.24093366,0.23636758,0.30113828,0.22156043],
       [ 0.44093642,0.2245989 ,0.24515967,0.08930501],
       [ 0.05540339,0.10013942,0.30361843,0.54083872],
       [ 0.11221886,0.75674808,0.09237131,0.03866173],
       [ 0.24885316,0.28243011,0.28312165,0.18559511],
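       [ 0.01205211,0.03740638,0.271065,0.67947656]], dtype=float32)
The goal is to convert this array to a DataFrame and append two columns: one holding each row's maximum and one holding the column position of that maximum.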
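The snippets in this post assume that a NumPy array named `array` already exists. For readers who want to run them, a minimal setup that recreates it from the values printed above might look like this (the construction itself is an assumption; in practice the array would come from a model's prediction output):

```python
import numpy as np

# Recreate the 9x4 probability matrix shown above.
# Assumption: building it by hand stands in for a real model's output.
array = np.array([
    [0.47288769, 0.23982215, 0.2261405, 0.06114962],
    [0.67969596, 0.11435176, 0.17647322, 0.02947907],
    [0.00621393, 0.01652142, 0.31117165, 0.66609299],
    [0.24093366, 0.23636758, 0.30113828, 0.22156043],
    [0.44093642, 0.2245989, 0.24515967, 0.08930501],
    [0.05540339, 0.10013942, 0.30361843, 0.54083872],
    [0.11221886, 0.75674808, 0.09237131, 0.03866173],
    [0.24885316, 0.28243011, 0.28312165, 0.18559511],
    [0.01205211, 0.03740638, 0.271065, 0.67947656],
], dtype=np.float32)

print(array.shape)  # (9, 4)
```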
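First, a quick look at the argmax function.
argmax(a, axis=None)

# a is the input array (any array-like, such as the NumPy array above)

# axis is the axis to search along. The default, None, flattens the array first; axis=1 returns the position of the maximum within each row, and axis=0 within each column.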
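To make the effect of `axis` concrete, here is a small sketch on a made-up 2x3 array (the values and the variable names `a`, `flat`, `per_col`, `per_row` are illustrative only and not from the original post):

```python
import numpy as np

# Hypothetical example data, chosen so the maxima are easy to spot
a = np.array([[1, 9, 3],
              [7, 2, 8]])

# axis=None (default): position of the maximum in the flattened array
flat = np.argmax(a)
# Convert the flat position back to (row, column) coordinates
flat_rc = np.unravel_index(flat, a.shape)

# axis=0: for each column, the row that holds the maximum
per_col = np.argmax(a, axis=0)

# axis=1: for each row, the column that holds the maximum
per_row = np.argmax(a, axis=1)

print(flat, flat_rc)  # 1 and the coordinates (0, 1)
print(per_col)        # [1 0 1]
print(per_row)        # [1 2]
```

For the task in this post, `axis=1` is the relevant case, since each row holds the class probabilities of one sample.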
For the DataFrame, the code is as follows:
# Import libraries
import pandas as pd
import numpy as np
# Convert the array to a DataFrame
arr = pd.DataFrame(array, columns=["one", "two", "three", "four"])
# Compute the row-wise maximum and the column position of each maximum
arr['max_value'] = arr.max(axis=1)
arr['max_index'] = np.argmax(array, axis=1)
# Display the result:
arr
Out:
        one       two     three      four  max_value  max_index
0  0.472888  0.239822  0.226140  0.061150   0.472888          0
1  0.679696  0.114352  0.176473  0.029479   0.679696          0
2  0.006214  0.016521  0.311172  0.666093   0.666093          3
3  0.240934  0.236368  0.301138  0.221560   0.301138          2
4  0.440936  0.224599  0.245160  0.089305   0.440936          0
5  0.055403  0.100139  0.303618  0.540839   0.540839          3
6  0.112219  0.756748  0.092371  0.038662   0.756748          1
7  0.248853  0.282430  0.283122  0.185595   0.283122          2
8  0.012052  0.037406  0.271065  0.679477   0.679477          3

Now suppose we also want the second-largest value in each row and its index. How can that be done?
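Approach: set each row's maximum to 0, then take the row-wise maximum and its index again. This works here because every value is a positive probability, so a zeroed entry can never be selected as the maximum.
The code is as follows:
# Work on a copy so the DataFrame arr built above keeps the original values
array = array.copy()
# Set each row's maximum to 0
array[np.arange(array.shape[0]), np.argmax(array, axis=1)] = 0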
array
array([[ 0.      ,0.23982215,0.2261405 ,0.06114962],
       [ 0.      ,0.11435176,0.17647322,0.02947907],
       [ 0.00621393,0.01652142,0.31117165,0.      ],
       [ 0.24093366,0.23636758,0.      ,0.22156043],
       [ 0.      ,0.2245989 ,0.24515967,0.08930501],
       [ 0.05540339,0.10013942,0.30361843,0.      ],
       [ 0.11221886,0.      ,0.09237131,0.03866173],
       [ 0.24885316,0.28243011,0.      ,0.18559511],
       [ 0.01205211,0.03740638,0.271065,0.      ]], dtype=float32)
# Extract the second-largest value and its index
arr['second_value']=array.max(axis=1)
arr['second_index']=np.argmax(array,axis=1)
arr
Out:
        one       two     three      four  max_value  max_index  second_value  second_index
0  0.472888  0.239822  0.226140  0.061150   0.472888          0      0.239822             1
1  0.679696  0.114352  0.176473  0.029479   0.679696          0      0.176473             2
2  0.006214  0.016521  0.311172  0.666093   0.666093          3      0.311172             2
3  0.240934  0.236368  0.301138  0.221560   0.301138          2      0.240934             0
4  0.440936  0.224599  0.245160  0.089305   0.440936          0      0.245160             2
5  0.055403  0.100139  0.303618  0.540839   0.540839          3      0.303618             2
6  0.112219  0.756748  0.092371  0.038662   0.756748          1      0.112219             0
7  0.248853  0.282430  0.283122  0.185595   0.283122          2      0.282430             1
8  0.012052  0.037406  0.271065  0.679477   0.679477          3      0.271065             2
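Setting the maximum to 0 overwrites the data and relies on all values being positive. As a possible alternative, sketched here under the assumption that sorting every row is acceptable for the matrix size, `np.argsort` can rank the columns of each row once and yield the largest and second-largest values together. The matrix `probs` and its column names are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical probability matrix (not the data from this post)
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.2, 0.3],
                  [0.3, 0.3, 0.4]], dtype=np.float32)

# Rank column positions of each row from largest to smallest value.
# Negating the values turns the ascending sort into a descending one;
# kind="stable" keeps the leftmost column first when values tie.
order = np.argsort(-probs, axis=1, kind="stable")

# Keep the two best positions per row and look up their values
top2_idx = order[:, :2]
top2_val = np.take_along_axis(probs, top2_idx, axis=1)

df = pd.DataFrame(probs, columns=["a", "b", "c"])
df["max_value"] = top2_val[:, 0]
df["max_index"] = top2_idx[:, 0]
df["second_value"] = top2_val[:, 1]
df["second_index"] = top2_idx[:, 1]
print(df)
```

Because `probs` is never modified, the same idea extends to the top k values by slicing `order[:, :k]`, at the cost of a full sort per row.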

Source: https://www.jb51.net/python/319177j8u.htm