Reading the Python Source: Objects

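Everything Is an Object

Python was designed from the start as an object-oriented language. Numbers, strings, lists, tuples, classes, functions and so on are all objects, which means any of them can be assigned to a variable or passed to a function as an argument.

Note: "object" here refers to the general concept, not to the object type of the Python language.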
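
A quick way to see this from the Python side is the minimal sketch below. The helper name kind_of and the sample values are made up for illustration; the point is only that values of very different kinds can be stored in the same list, passed to the same function, and asked for their type.

```python
def kind_of(thing):
    """Return the name of the type of whatever object is passed in."""
    return type(thing).__name__

# Numbers, strings, containers, classes and functions are all ordinary values.
samples = [1, "text", [1, 2], (1, 2), int, kind_of]
kinds = [kind_of(sample) for sample in samples]
print(kinds)
```

Even the class int and the function kind_of itself go through the same call without any special handling.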

PyObject

Every object in Python starts with a PyObject, defined in Include/object.h. This struct is the foundation of all objects, so let's take a look:

// object.h
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

First, _PyObject_HEAD_EXTRA, which is defined in the same file:

/* Define pointers to support a doubly-linked list of all live heap objects. */
#define _PyObject_HEAD_EXTRA \
    struct _object *_ob_next; \
    struct _object *_ob_prev;

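_PyObject_HEAD_EXTRA declares two pointers, the next and previous links of a doubly linked list, which matches what the comment says. Note that the macro only expands to these pointers in builds compiled with Py_TRACE_REFS defined; in a normal build it is empty, so PyObject consists of just the two members that follow.

Then come the two members of the struct:

  • ob_refcnt, the reference count
  • ob_type, a pointer to the type object
    Their types are Py_ssize_t and struct _typeobject * respectively.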
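
The ob_type member can be observed from Python itself. The sketch below is CPython-specific and rests on two assumptions: id(obj) is the memory address of the object (a CPython implementation detail), and ob_type is the last member of PyObject, as in the struct shown above. The helper name ob_type_address is made up for illustration.

```python
import ctypes

def ob_type_address(obj):
    """Read the ob_type pointer stored in obj's PyObject header (CPython only)."""
    # object.__basicsize__ is sizeof(PyObject); ob_type is assumed to be
    # the last pointer-sized member of that header.
    offset = object.__basicsize__ - ctypes.sizeof(ctypes.c_void_p)
    return ctypes.c_void_p.from_address(id(obj) + offset).value

# For each object, the stored pointer should be the address of its type object.
checks = [ob_type_address(value) == id(type(value))
          for value in (42, "text", [1, 2], len, int)]
print(checks)
```

Computing the offset from object.__basicsize__ instead of hard-coding it keeps the sketch independent of whether extra header fields are compiled in.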

Py_ssize_t

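ob_refcnt is the object's reference count, used by the reference-counting part of Python's memory management. For an object A, each time a new PyObject * reference to A is created, A's reference count is incremented; each time such a reference is dropped, the count is decremented. When the count reaches 0, A's deallocator (the tp_dealloc of its type) is called to release its memory.

ob_refcnt has type Py_ssize_t, which is defined in Include/pyport.h: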
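
The counting described above can be watched with sys.getrefcount. This is a minimal sketch for CPython; the class Node and the function refcount_demo are hypothetical names, and the absolute numbers printed depend on the interpreter version, so only the differences between them are meaningful.

```python
import sys
import weakref

class Node:
    """Plain class; its instances support weak references."""

def refcount_demo():
    node = Node()
    probe = weakref.ref(node)            # a weak reference does not raise ob_refcnt
    base = sys.getrefcount(node)
    alias = node                         # a new reference: the count goes up by one
    with_alias = sys.getrefcount(node)
    del alias                            # the reference is dropped: back to base
    after_del = sys.getrefcount(node)
    del node                             # last reference gone: the object is deallocated
    freed = probe() is None
    return base, with_alias, after_del, freed

base, with_alias, after_del, freed = refcount_demo()
print(with_alias - base, after_del - base, freed)
```

The weak reference is only there to check, after the last del, that the object was really destroyed.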

/* Py_ssize_t is a signed integral type such that sizeof(Py_ssize_t) ==
* sizeof(size_t). C99 doesn't define such a thing directly (size_t is an
* unsigned integral type). See PEP 353 for details.
*/
#ifdef HAVE_SSIZE_T
typedef ssize_t Py_ssize_t;
#elif SIZEOF_VOID_P == SIZEOF_SIZE_T
typedef Py_intptr_t Py_ssize_t;
#else
# error "Python needs a typedef for Py_ssize_t in pyport.h."
#endif

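Py_ssize_t is a signed integer type with the same size as size_t. C99 does not define such a type; ssize_t comes from POSIX rather than from the C standard, so on platforms where it is unavailable Py_intptr_t is used instead.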
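
These properties can be checked from Python with ctypes, which exposes c_ssize_t and c_size_t. The variable names below are chosen for illustration; sys.maxsize is documented as the largest value a Py_ssize_t can hold.

```python
import ctypes
import sys

ssize_bytes = ctypes.sizeof(ctypes.c_ssize_t)
size_bytes = ctypes.sizeof(ctypes.c_size_t)

# Largest value of a signed integer that is ssize_bytes wide.
largest_signed = 2 ** (8 * ssize_bytes - 1) - 1

# The same bit pattern read as signed and as unsigned.
as_signed = ctypes.c_ssize_t(-1).value
as_unsigned = ctypes.c_size_t(-1).value

print(ssize_bytes, size_bytes, sys.maxsize == largest_signed)
```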

PyTypeObject

ob_type is a pointer to a struct _typeobject, that is, to the type object that describes the object's type. The struct is defined in Include/object.h:

#ifdef Py_LIMITED_API
typedef struct _typeobject PyTypeObject; /* opaque */
#else
typedef struct _typeobject {
    PyObject_VAR_HEAD
    const char *tp_name; /* For printing, in format "<module>.<name>" */
    Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */

    /* Methods to implement standard operations */

    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    PyAsyncMethods *tp_as_async; /* formerly known as tp_compare (Python 2)
                                    or tp_reserved (Python 3) */
    reprfunc tp_repr;

    /* Method suites for standard classes */

    PyNumberMethods *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods *tp_as_mapping;

    /* More standard operations (here for binary compatibility) */

    hashfunc tp_hash;
    ternaryfunc tp_call;
    reprfunc tp_str;
    getattrofunc tp_getattro;
    setattrofunc tp_setattro;

    /* Functions to access object as input/output buffer */
    PyBufferProcs *tp_as_buffer;

    /* Flags to define presence of optional/expanded features */
    unsigned long tp_flags;

    const char *tp_doc; /* Documentation string */

    /* Assigned meaning in release 2.0 */
    /* call function for all accessible objects */
    traverseproc tp_traverse;

    /* delete references to contained objects */
    inquiry tp_clear;

    /* Assigned meaning in release 2.1 */
    /* rich comparisons */
    richcmpfunc tp_richcompare;

    /* weak reference enabler */
    Py_ssize_t tp_weaklistoffset;

    /* Iterators */
    getiterfunc tp_iter;
    iternextfunc tp_iternext;

    /* Attribute descriptor and subclassing stuff */
    struct PyMethodDef *tp_methods;
    struct PyMemberDef *tp_members;
    struct PyGetSetDef *tp_getset;
    struct _typeobject *tp_base;
    PyObject *tp_dict;
    descrgetfunc tp_descr_get;
    descrsetfunc tp_descr_set;
    Py_ssize_t tp_dictoffset;
    initproc tp_init;
    allocfunc tp_alloc;
    newfunc tp_new;
    freefunc tp_free; /* Low-level free-memory routine */
    inquiry tp_is_gc; /* For PyObject_IS_GC */
    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;

    /* Type attribute cache version tag. Added in version 2.6 */
    unsigned int tp_version_tag;

    destructor tp_finalize;

#ifdef COUNT_ALLOCS
    /* these must be last and never explicitly initialized */
    Py_ssize_t tp_allocs;
    Py_ssize_t tp_frees;
    Py_ssize_t tp_maxalloc;
    struct _typeobject *tp_prev;
    struct _typeobject *tp_next;
#endif
} PyTypeObject;
#endif

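Compared with most of the standard types, PyTypeObject is huge. That is because each type object stores a large number of values, mostly C function pointers, each of which implements a small part of the type's behavior.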
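
One visible consequence of keeping these function pointers on the type object is that operations such as repr() consult the type, not the instance. The following sketch shows this for __repr__, which corresponds to the tp_repr slot; the class Point is made up for illustration.

```python
class Point:
    def __repr__(self):
        return "Point()"

p = Point()

# Stored in the instance's own dict only; the type object is untouched.
p.__repr__ = lambda: "patched on instance"
from_instance_patch = repr(p)

# Assigning on the class changes what the type provides for repr().
Point.__repr__ = lambda self: "patched on type"
from_type_patch = repr(p)

print(from_instance_patch, from_type_patch)
```

The first repr() call ignores the instance attribute, while the second picks up the change made on the class.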

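The _typeobject struct begins with the macro PyObject_VAR_HEAD, which means a type object is itself a variable-length object:

#define PyObject_VAR_HEAD PyVarObject ob_base;

PyVarObject extends PyObject with a count of the items in the variable-length part: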

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
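
The role of ob_size can be seen on a tuple, which is one such variable-length object. This is a CPython-only sketch under the assumption that id(obj) is the object's address and that ob_size directly follows the PyObject header, as in the struct above; the helper name ob_size is made up for illustration.

```python
import ctypes
import sys

def ob_size(var_obj):
    """Read ob_size from a variable-length object such as a tuple (CPython only)."""
    # object.__basicsize__ is sizeof(PyObject), so ob_size starts right after it.
    return ctypes.c_ssize_t.from_address(id(var_obj) + object.__basicsize__).value

items = ("a", "b", "c")
stored_count = ob_size(items)

# Each extra item costs tp_itemsize bytes on top of the fixed part.
extra_bytes = sys.getsizeof(items) - sys.getsizeof(())

print(stored_count, extra_bytes)
```

Only pass objects whose struct really starts with PyVarObject (tuples and lists, for example); for other objects the bytes at that offset mean something else.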

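Apart from PyObject_VAR_HEAD, the members of _typeobject fall into four groups:

  • tp_name, the type name, used mainly inside Python and for debugging to identify an object's type;
  • tp_basicsize and tp_itemsize, which say how much memory to allocate when an object of this type is created;
  • the operations supported by objects of this type (the many function pointers such as tp_print);
  • type information about the type object itself.

For details on the other fields, see Type Objects in the Python/C API reference.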
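
Several of these fields are exposed to Python code as attributes of type objects, so they can be inspected without reading C. The sketch below uses a hypothetical helper type_summary; the dictionary keys are just labels naming the C field each attribute corresponds to.

```python
import ctypes

def type_summary(tp):
    """Collect a few PyTypeObject fields that Python exposes as attributes."""
    return {
        "tp_name": tp.__name__,            # derived from tp_name
        "tp_basicsize": tp.__basicsize__,  # size of the fixed part, in bytes
        "tp_itemsize": tp.__itemsize__,    # size of one variable-part item
        "tp_base": tp.__base__,
        "tp_mro": tp.__mro__,
    }

tuple_info = type_summary(tuple)
bool_info = type_summary(bool)
pointer_size = ctypes.sizeof(ctypes.c_void_p)

print(tuple_info["tp_name"], tuple_info["tp_basicsize"], tuple_info["tp_itemsize"])
```

For tuple, the item size equals the size of a pointer, because each slot in the variable part holds a PyObject *.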