以源碼探索 PHP 查詢 MySQL 後儲存資料結果的方式

基於以下系列討論內容的一大串源碼追蹤。以印證討論內容。

Remember this: Open your mind, use the source.

Source code version: PHP-5.2.5 release

/ext/mysql/php_mysql.c

1.line: 1395-1400, context of mysql_query()

/* proto resource mysql_query(string query [, int link_identifier])
   Sends an SQL query to MySQL */
PHP_FUNCTION(mysql_query)
{
    php_mysql_do_query(INTERNAL_FUNCTION_PARAM_PASSTHRU, MYSQL_STORE_RESULT);
}

2.line: 1404-1409, context of mysql_unbuffered_query()

/* proto resource mysql_unbuffered_query(string query [, int link_identifier])
   Sends an SQL query to MySQL, without fetching and buffering the result rows */
PHP_FUNCTION(mysql_unbuffered_query)
{
    php_mysql_do_query(INTERNAL_FUNCTION_PARAM_PASSTHRU, MYSQL_USE_RESULT);
}

3.line: 1345-1359, context of php_mysql_do_query_general()

if(use_store == MYSQL_USE_RESULT) {
        mysql_result=mysql_use_result(&mysql->conn);
    } else {
        mysql_result=mysql_store_result(&mysql->conn);
    }
    if (!mysql_result) {
        if (PHP_MYSQL_VALID_RESULT(&mysql->conn)) { /* query should have returned rows */
            php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to save result set");
            RETURN_FALSE;
        } else {
            RETURN_TRUE;
        }
    }
    MySG(result_allocated)++;
    ZEND_REGISTER_RESOURCE(return_value, mysql_result, le_result);

上面的源碼內容符合 PHP Manual, PHP Large result sets and summary tables 等文的解釋。當程序員透過 mysql_query() 查詢時，資料結果集(result set)會被儲存在 client (即PHP 這端)。而用 mysql_unbuffered_query() 時，則只會在 clinet 儲存 "current row"。

MySQL 5.1 Reference Manual

看完 PHP 的源碼，再看 MySQL 的部份，以進一步確認 MySQL 的源碼實際上是否符合前述解釋。

PHP 源碼顯示，PHP 的 mysql_query() 調用 MySQL C API 的 mysql_store_result()，mysql_unbuffered_query() 調用 MySQL C API 的 mysql_use_result()。故列示於下。

mysql_store_result()

mysql_store_result() reads the entire result of a query to the client, allocates a MYSQL_RES structure, and places the result into this structure.

在 client 端配置一個 MYSQL_RES 的結構體，並儲放資料結果於此結構中。

mysql_use_result()

mysql_use_result() initiates a result set retrieval but does not actually read the result set into the client like mysql_store_result() does. Instead, each row must be retrieved individually by making calls to mysql_fetch_row().

MYSQL_RES

This structure represents the result of a query that returns rows (SELECT, SHOW, DESCRIBE, EXPLAIN). The information returned from a query is called the result set in the remainder of this section.

Source code version: mysql-5.1.21-beta

接著我們看 MySQL 源碼以印證手冊說明。

/include/mysql.h

line: 316-332, context of MYSQL_RES

typedef struct st_mysql_res {
  my_ulonglong  row_count;
  MYSQL_FIELD   *fields;
  MYSQL_DATA    *data;
  MYSQL_ROWS    *data_cursor;
  unsigned long *lengths;       /* column lengths of current row */
  MYSQL     *handle;        /* for unbuffered reads */
  const struct st_mysql_methods *methods;
  MYSQL_ROW row;            /* If unbuffered read */
  MYSQL_ROW current_row;        /* buffer to current row */
  MEM_ROOT  field_alloc;
  unsigned int  field_count, current_field;
  my_bool   eof;            /* Used by mysql_fetch_row */
  /* mysql_stmt_close() had to cancel this result */
  my_bool       unbuffered_fetch_cancelled;
  void *extension;
} MYSQL_RES;

MYSQL_RES 所包含的內容有二種情形。一種是 bufered, 另一種是 unbuffered 。剛好對應 mysql_store_result(), mysql_use_result() 的儲存策略。

/sql/client.c

繼續觀察 mysql_store_result() 的源碼內容。line: 2802-2844, context of mysql_store_result()

MYSQL_RES * STDCALL mysql_store_result(MYSQL *mysql)
{
  MYSQL_RES *result;
  DBUG_ENTER("mysql_store_result");
  /* read from the actually used connection */
  mysql = mysql->last_used_con;
  if (!mysql->fields)
    DBUG_RETURN(0);
  if (mysql->status != MYSQL_STATUS_GET_RESULT)
  {
    set_mysql_error(mysql, CR_COMMANDS_OUT_OF_SYNC, unknown_sqlstate);
    DBUG_RETURN(0);
  }
  mysql->status=MYSQL_STATUS_READY;     /* server is ready */
  if (!(result=(MYSQL_RES*) my_malloc((uint) (sizeof(MYSQL_RES)+
                          sizeof(ulong) *
                          mysql->field_count),
                      MYF(MY_WME | MY_ZEROFILL))))
  {
    set_mysql_error(mysql, CR_OUT_OF_MEMORY, unknown_sqlstate);
    DBUG_RETURN(0);
  }
  result->methods= mysql->methods;
  result->eof=1;                /* Marker for buffered */
  result->lengths=(ulong*) (result+1);
  if (!(result->data=
    (*mysql->methods->read_rows)(mysql,mysql->fields,mysql->field_count)))
  {
    my_free((uchar*) result,MYF(0));
    DBUG_RETURN(0);
  }
  mysql->affected_rows= result->row_count= result->data->rows;
  result->data_cursor=  result->data->data;
  result->fields=   mysql->fields;
  result->field_alloc=  mysql->field_alloc;
  result->field_count=  mysql->field_count;
  /* The rest of result members is bzeroed in malloc */
  mysql->fields=0;              /* fields is now in result */
  clear_alloc_root(&mysql->field_alloc);
  /* just in case this was mistakenly called after mysql_stmt_execute() */
  mysql->unbuffered_fetch_owner= 0;
  DBUG_RETURN(result);              /* Data fetched */
}

line: 1309-1312, context of cli_read_rows()

/* Read all rows (fields or data) from server */

MYSQL_DATA *cli_read_rows(MYSQL *mysql,MYSQL_FIELD *mysql_fields,
              unsigned int fields)

源碼內容顯示 mysql_store_result() 會註記資料結果為 "buffered"，並清除 "unbuffered" 策略的相關內容。藉由函數指標 read_rows 調用 cli_read_rows() ，將資料結果集讀取至 client 。

結論

在前一系列的討論中，我們的出發點是 CakePHP 的資料庫存取行為，因為 CakePHP 使用的是 mysql_query()，而非 mysql_unbuffered_query()，所以我們對資料結果集的記憶體配置方式，也是針對 mysql_query()。事實上，絕大多數的案例中， PHP 程序員使用的都是 mysql_query() 而非 mysql_unbuffered_query()。

我在稍後的回應中也說明，PHP 的 MySQL 函數，可以區分2種儲存策略 (即mysql_query() 與 mysql_unbuffered_query())。不過 FIEND 顯然忽視這點，隻字未提mysql_unbuffered_query()，始終堅持他的認知: "資料結果應該是儲存在 DB 端"。亦即他認為只有一種策略。

Ok, 也許資料庫管理與規劃書籍介紹的策略只有一種，但那是通則、是理論。但此處討論的卻是 PHP 查詢 MySQL 的實例。我說明的內容，並不是我的個人認知，而是程式碼明擺著的事實。如果 FIEND 還要說誰的 DB觀念不好，在誤導別人，請去對 PHP 和 MySQL 的開發團隊說吧。

樂多舊網址: http://blog.roodo.com/rocksaying/archives/4986803.html

樂多舊回應

moby211@hotmail.com(Moby) (#comment-16155695)

Fri, 04 Apr 2008 19:15:34 +0800

有兩個問題請教：
1.使用mysql_unbuffered_query()，是否會造成你在討論串中所說的，佔用mysql server的記憶體？

2.到底mysql做換頁功能的最好方法是什麼？先用limit，再用不含limit的count嗎？

未留名 (#comment-16177207)

Tue, 08 Apr 2008 10:46:53 +0800

1. MySQL 本身必然要配置記憶體(mysql server的記憶體)儲存查詢結果，這不需多談。
本文談的是 php 的記憶體使用(application server的記憶體)。

2.最適策略視資料量而定。
有時，做 results cache 是比較好的策略。