So a CSV file is a table of values separated by commas. Hence the name: “Comma-Separated Values”, or CSV.
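As a minimal sketch of what this looks like in practice, the snippet below reads a small CSV into a DataFrame. The CSV text and column names are made up for illustration, and io.StringIO stands in for a file on disk so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV contents: a header row, then one line per record.
csv_text = "title,author,year\nDune,Frank Herbert,1965\nEmma,Jane Austen,1815\n"

# read_csv accepts a file path or a file-like object.
books = pd.read_csv(io.StringIO(csv_text))

print(books.shape)          # (rows, columns)
print(list(books.columns))  # column names taken from the header row
```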
We can access the property of an object by accessing it as an attribute. A book object, for example, might have a title property, which we can access by writing book.title. Columns in a pandas DataFrame work in much the same way.
Index-based selection: selecting data based on its numerical position in the data. iloc follows this paradigm.
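A short sketch of the two equivalent ways to pull out a column, using a made-up books DataFrame (the data and column names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data for illustration.
books = pd.DataFrame({"title": ["Dune", "Emma"], "year": [1965, 1815]})

by_attribute = books.title   # attribute-style access
by_key = books["title"]      # dictionary-style access

# Both return the same Series.
print(by_attribute.equals(by_key))
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method, so the bracket form is the safer default.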
Both loc and iloc are row-first, column-second. This is the opposite of plain bracket indexing on a DataFrame, df[column][row], which is column-first, row-second.
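To make the row-first ordering concrete, here is a small sketch with iloc on a made-up DataFrame (the data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"fruit": ["apple", "banana", "cherry"],
                   "price": [1.0, 0.5, 3.0]})

first_row = df.iloc[0]         # row at position 0, returned as a Series
first_column = df.iloc[:, 0]   # every row, column at position 0
cell = df.iloc[1, 0]           # row 1 first, then column 0

# Plain bracket indexing goes the other way round: column first, then row.
same_cell = df["fruit"][1]

print(cell, same_cell)
```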
The second paradigm for attribute selection is the one followed by the loc operator: label-based selection. In this paradigm, it’s the data index value, not its position, which matters.
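The difference between label and position is easiest to see when the index labels do not match the row positions. The sketch below uses a made-up DataFrame whose index is 10, 20, 30 (an assumption chosen for illustration):

```python
import pandas as pd

# Hypothetical data; the labels 10, 20, 30 deliberately differ from positions 0, 1, 2.
df = pd.DataFrame({"price": [1.0, 0.5, 3.0]}, index=[10, 20, 30])

by_label = df.loc[10, "price"]   # row labelled 10, column labelled "price"
by_position = df.iloc[0, 0]      # row at position 0, column at position 0

# There is no row labelled 0, so loc raises KeyError here.
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_label, by_position, label_zero_exists)
```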
When choosing between loc and iloc, keep in mind that the two methods use slightly different indexing schemes.
iloc uses the Python stdlib indexing scheme, where the first element of the range is included and the last one excluded. So 0:10 will select entries 0,…,9. loc, meanwhile, indexes inclusively. So 0:10 will select entries 0,…,10.
Why the change? Remember that loc can index any stdlib type: strings, for example. If we have a DataFrame with index values Apples, …, Potatoes, …, and we want to select "all the alphabetical fruit choices between Apples and Potatoes", then it's a lot more convenient to index df.loc['Apples':'Potatoes'] than it is to index something like df.loc['Apples':'Potatoet'] (t coming after s in the alphabet).
This is particularly confusing when the DataFrame index is a simple numerical list, e.g. 0,…,1000. In this case df.iloc[0:1000] will return 1000 entries, while df.loc[0:1000] returns 1001 of them! To get 1000 elements using loc, you will need to go one lower and ask for df.loc[0:999].
This point deserves a little more explanation. First, loc stands for "location", and the i in iloc stands for "integer". The difference between the two is as follows: loc looks rows up by the index, so if the df that was read in defines an index, loc uses that index to find the corresponding rows. iloc does not use the index at all; it looks rows up by row number, which starts at 0 and increases by 1 each row. This article helps with understanding: https://zhuanlan.zhihu.com/p/129898162
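The slicing difference can be checked directly. This is a minimal sketch on made-up data: a DataFrame with the default integer index 0,…,1000, and a small one with string labels (both are assumptions for illustration).

```python
import pandas as pd

# Hypothetical data: 1001 rows with the default index 0..1000.
df = pd.DataFrame({"value": range(1001)})

n_iloc = len(df.iloc[0:1000])      # end position excluded
n_loc = len(df.loc[0:1000])        # end label included
n_loc_fixed = len(df.loc[0:999])   # go one lower to match iloc

# Label slices on a string index are inclusive on both ends too.
fruit = pd.DataFrame({"price": [1, 2, 3, 4]},
                     index=["Apples", "Bananas", "Potatoes", "Tomatoes"])
between = fruit.loc["Apples":"Potatoes"]

print(n_iloc, n_loc, n_loc_fixed, list(between.index))
```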