Python Data Analysis Tech Stack, Chapter 06: Preparing Data with Pandas, 04 DataFrames

Published: January 22, 2024

04 DataFrames

A DataFrame is an extension of a Series. It is a two-dimensional data structure for storing data. A Series object contains two components (a set of values and the index labels attached to these values), whereas a DataFrame object contains three: the column object, the index object, and a NumPy array object that contains the values.

The index and columns are collectively called the axes. The index forms axis “0” and the columns form axis “1”.
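
The three components and the two axes described above can be inspected directly. The following is a minimal sketch; the data and the row labels "students" and "teachers" are illustrative assumptions, not taken from the original text.

```python
import numpy as np
import pandas as pd

# Illustrative data; the row labels are assumed for this sketch
df = pd.DataFrame(
    [[22, 24, 20], [40, 50, 45]],
    index=["students", "teachers"],
    columns=["class 1", "class 2", "class 3"],
)

columns = df.columns    # column labels (axis 1)
index = df.index        # row labels (axis 0)
values = df.to_numpy()  # the underlying values as a NumPy array

# df.axes lists the index first (axis 0), then the columns (axis 1)
row_axis, column_axis = df.axes

# Summing along axis 0 collapses the rows: one total per column
column_totals = df.sum(axis=0)
# Summing along axis 1 collapses the columns: one total per row
row_totals = df.sum(axis=1)
```

The `axis` argument accepted by many pandas methods refers to these same two axes, which is why `sum(axis=0)` returns one value per column.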

We look at various methods for creating DataFrames in Table 6-2.

By combining Series objects: Here, we define two Series and then use the pd.DataFrame function to create a new DataFrame called “combined_ages”. Each Series becomes one row of the DataFrame. We name the columns in a separate step.

import pandas as pd

student_ages = pd.Series([22, 24, 20])  # Series 1
teacher_ages = pd.Series([40, 50, 45])  # Series 2
combined_ages = pd.DataFrame([student_ages, teacher_ages])  # each Series becomes a row
combined_ages.columns = ['class 1', 'class 2', 'class 3']  # naming columns
combined_ages
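
How the Series are passed determines their orientation. The sketch below contrasts the list form used above with a dictionary of Series; the keys "students" and "teachers" are assumed names chosen for illustration.

```python
import pandas as pd

student_ages = pd.Series([22, 24, 20])
teacher_ages = pd.Series([40, 50, 45])

# A list of Series: each Series becomes a row (2 rows, 3 columns)
as_rows = pd.DataFrame([student_ages, teacher_ages])

# A dict of Series: each Series becomes a column named after its key
# (3 rows, 2 columns); the key names are illustrative
as_columns = pd.DataFrame({"students": student_ages, "teachers": teacher_ages})
```

In this small case the two results hold the same values, with rows and columns swapped.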

From a dictionary: A dictionary is passed as an argument to the pd.DataFrame function. The keys become the column names, and each value is a list holding the values of that column.

combined_ages = pd.DataFrame({'class 1': [22, 40], 'class 2': [24, 50], 'class 3': [20, 45]})
combined_ages

From a NumPy array: Here, we first create a NumPy array using the np.arange function. We then reshape this array into two rows and four columns and pass it to pd.DataFrame.

import numpy as np

numerical_df = pd.DataFrame(np.arange(1, 9).reshape(2, 4))
numerical_df
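
Because a NumPy array carries no labels, a DataFrame built from one gets default integer labels on both axes. The sketch below shows this and one way to supply column names at creation time; the names "a" to "d" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

array = np.arange(1, 9).reshape(2, 4)  # values 1 to 8 in 2 rows, 4 columns
numerical_df = pd.DataFrame(array)

# Without explicit labels, both axes are numbered from 0
default_columns = list(numerical_df.columns)
default_index = list(numerical_df.index)

# Column names can be supplied through the columns parameter
# (the names "a" to "d" are illustrative)
labeled_df = pd.DataFrame(array, columns=["a", "b", "c", "d"])
```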

Using a list of tuples: We re-create the “combined_ages” DataFrame from a list of tuples. Each tuple corresponds to one row of the DataFrame.

combined_ages = pd.DataFrame([(22, 24, 20), (40, 50, 45)], columns=['class 1', 'class 2', 'class 3'])
combined_ages

To sum up, we can create a DataFrame from a dictionary, a NumPy array, or a list of tuples, or by combining Series objects. Each of these methods uses the pd.DataFrame function. Note that the characters “D” and “F” in this name are uppercase; pd.dataframe does not exist and raises an AttributeError.
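
The capitalization point can be checked with a short sketch: the lowercase name is not an attribute of the pandas module, so looking it up fails before any data is processed.

```python
import pandas as pd

try:
    pd.dataframe([(22, 24, 20)])  # wrong capitalization
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# The correctly capitalized name works as expected
df = pd.DataFrame([(22, 24, 20)])
```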

Source: https://blog.csdn.net/qq_37703224/article/details/135739049