瞧那部落格: Python Pecan Web Framework Tutorial (1)

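Preface

I have been looking for a good, easy-to-use web server framework for a while. So far I have tried WSGI, Tornado, Flask, and Golang.
Besides being easy to use and cutting development time, what matters most to me is performance and the blocking problem. The best I have used so far is probably Golang: it performs well and has no blocking problem. The catch is that it is a different language, and development still involves quite a lot of rework.

What I was happiest with before was WSGI. Because I had OpenStack to imitate, I even built my own framework on top of it. I had planned to write an article about it, but it is too complicated and would take a very long post to cover. Recently I found another gem, Pecan, the web framework used by OpenStack projects such as Ceilometer, Magnum, and Ironic. I played with it and found that it is very much the framework I wanted, so instead of introducing my WSGI framework, I will introduce Pecan.

It is easy to use, cuts development time, has no blocking problem (no extra eventlet setup needed), and needs no special router or dispatcher configuration; the relationship between pecan and its controllers matches a developer's intuition. It also integrates with Paste to provide middleware. In a word: perfect.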

Installing the pecan package

Installation is simple. First download the source code:

git clone https://github.com/pecan/pecan.git

Then install it the same way as any other Python package:

cd pecan
python setup.py install

That's it.

Basic Pecan

The code for this part is on my GitHub:

https://github.com/jonahwu/lab/tree/master/pecan/pecantest/testproject_ori

First, generate a project. After that we will look at the overall directory structure.

pecan create test_project
cd test_project
python setup.py develop

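With develop, you can edit files in the current directory and run them directly. For a project that is just getting started, this is the approach I recommend.
If you use python setup.py install instead, the package is installed into /usr/local/lib/python2.7/dist-packages/, and every change requires reinstalling before it takes effect.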

root@ubuntu:~/pecan_test/test_project_ori# tree
.
├── build
│   ├── bdist.linux-x86_64
│   └── lib.linux-x86_64-2.7
│       └── test_project
│           ├── app.py
│           ├── controllers
│           │   ├── __init__.py
│           │   └── root.py
│           ├── __init__.py
│           ├── model
│           │   └── __init__.py
│           └── tests
│               ├── config.py
│               ├── __init__.py
│               ├── test_functional.py
│               └── test_units.py
├── config.py
├── config.pyc
├── dist
│   └── test_project-0.1-py2.7.egg
├── MANIFEST.in
├── public
│   ├── css
│   │   └── style.css
│   └── images
│       └── logo.png
├── setup.cfg
├── setup.py
├── test_project
│   ├── app.py
│   ├── controllers
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── root.py
│   │   └── root.pyc
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── model
│   │   ├── __init__.py
│   │   └── __init__.pyc
│   ├── templates
│   │   ├── error.html
│   │   ├── index.html
│   │   └── layout.html
│   └── tests
│       ├── config.py
│       ├── __init__.py
│       ├── test_functional.py
│       ├── test_units.py
│       └── test_units.pyc

Running the pecan server

Run pecan serve config.py. The result looks like this:

root@ubuntu:~/pecan_test/test_project_ori# pecan serve config.py


Starting server in PID 1968
serving on 0.0.0.0:8080, view at http://127.0.0.1:8080

The most important files are:

test_project/config.py
test_project/controllers/root.py

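config.py holds the server settings and the starting point for the controllers.
In the Advanced Pecan section we will see how powerful Pecan controllers are: every request can be developed starting from this one place, which is very intuitive.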

# Server Specific Configurations
server = {
    'port': '8080',
    'host': '0.0.0.0'
}

# Pecan Application Configurations
app = {
    'root': 'test_project.controllers.root.RootController',
    'modules': ['test_project'],
    'static_root': '%(confdir)s/public',
    'template_path': '%(confdir)s/test_project/templates',
    'debug': True,
    'errors': {
        404: '/error/404',
        '__force_dict__': True
    }
}

Here root is the most important concept: every request starts from it. The controller we are using now lives in the RootController class in root.py.
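To make the dotted 'root' string less magical, here is a minimal sketch of how such a string can be turned into a class with the standard library. This is only an illustration, not Pecan's actual loader; resolve_dotted_path is a hypothetical helper, and a standard-library class stands in for RootController so the snippet runs without the project installed.

```python
import importlib


def resolve_dotted_path(dotted_path):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attr_name = dotted_path.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# 'collections.OrderedDict' stands in for
# 'test_project.controllers.root.RootController' (assumption for the demo).
root_class = resolve_dotted_path('collections.OrderedDict')
print(root_class.__name__)
```

The point is simply that the string in config.py names a module path plus a class, so moving or renaming the controller means updating this one setting.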

from pecan import expose, redirect
from webob.exc import status_map

class RootController(object):

    @expose(generic=True, template='index.html')
    def index(self):
        return dict()

    @index.when(method='POST')
    def index_post(self, q):
        redirect('http://pecan.readthedocs.org/en/latest/search.html?q=%s' % q)

    @expose('error.html')
    def error(self, status):
        try:
            status = int(status)
        except ValueError:  # pragma: no cover
            status = 500
        message = getattr(status_map.get(status), 'explanation', '')
        return dict(status=status, message=message)

We can use curl to see what happens on these requests.

curl http://localhost:8080/index
or 
curl http://localhost:8080

{"versions": {"values": [{"status": "stable", "updated": "2013-02-13T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.telemetry-v2+json"}, {"base": "application/xml", "type": "application/vnd.openstack.telemetry-v2+xml"}], "id": "v2", "links": [{"href": "http://172.16.235.157:8080/v2", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}]}}

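The request goes straight to the index function, so to change the response for this request we only need to modify index.

Declaring the expose decorator on a function means requests can reach it; without it, the function is internal only. So if you forget expose, the function cannot be reached. I have made this mistake many times myself. More usage is covered in the next section, Advanced Pecan.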
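The idea behind expose can be sketched in a few lines of plain Python. This is a simplified illustration under my own assumptions, not how Pecan implements it: the expose decorator, NotFound, and call_handler below are hypothetical stand-ins.

```python
def expose(func):
    """Mark a method as reachable from a request (hypothetical stand-in)."""
    func.exposed = True
    return func


class NotFound(Exception):
    """Raised when no exposed handler matches the request."""


class RootController(object):

    @expose
    def index(self):
        return 'index page'

    def helper(self):  # no @expose: internal only
        return 'internal'


def call_handler(controller, name):
    """Call controller.<name> only if it was marked with @expose."""
    handler = getattr(controller, name, None)
    if handler is None or not getattr(handler, 'exposed', False):
        raise NotFound(name)
    return handler()


print(call_handler(RootController(), 'index'))
```

Calling call_handler(RootController(), 'helper') raises NotFound, which mirrors what happens when expose is forgotten: the function exists but requests cannot reach it.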

Advanced Pecan

The code for this part is on my GitHub:

https://github.com/jonahwu/lab/tree/master/pecan/pecantest/testproject

It is mainly adapted from the structure of the API server in ceilometer.

We changed the original program structure somewhat:

root@ubuntu:~/pecan_test/test_project# tree
.
├── api_paste.ini
├── api_paste.ini_bak
├── api.py
├── app_eventlet.py
├── app.py
├── app.pyc
├── app.wsgi
├── build
│   ├── bdist.linux-x86_64
│   └── lib.linux-x86_64-2.7
│       └── test_project
│           ├── app.py
│           ├── controllers
│           │   ├── __init__.py
│           │   └── root.py
│           ├── __init__.py
│           ├── model
│           │   └── __init__.py
│           └── tests
│               ├── config.py
│               ├── __init__.py
│               ├── test_functional.py
│               └── test_units.py
├── config.py
├── config.pyc
├── dist
│   └── test_project-0.1-py2.7.egg
├── healthcheck.py
├── healthcheck.pyc
├── __init__.py
├── __init__.pyc
├── MANIFEST.in
├── public
│   ├── css
│   │   └── style.css
│   └── images
│       └── logo.png
├── readme
├── setup.cfg
├── setup.py
├── test_project
│   ├── app.py
│   ├── app.pyc
│   ├── controllers
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── root.py
│   │   ├── root.py_bak
│   │   ├── root.pyc
│   │   └── v2
│   │       ├── __init__.py
│   │       ├── __init__.pyc
│   │       ├── meters.py
│   │       ├── meters.pyc
│   │       ├── query2.py
│   │       ├── query2.pyc
│   │       ├── query.py
│   │       ├── query.pyc
│   │       ├── root.py
│   │       └── root.pyc
│   ├── __init__.py
│   ├── __init__.pyc

This part combines the following:

v2 Controller
Method
Adding endpoints
Input/output type control
Using the lookup function to control URL path access
Request Information
Paste
middleware
Non-blocking configuration

v2 Controller

Let's look at root.py in controllers, which I mentioned earlier as the most important file:

import pecan

from test_project.controllers.v2 import root as v2

MEDIA_TYPE_JSON = 'application/vnd.openstack.telemetry-%s+json'
MEDIA_TYPE_XML = 'application/vnd.openstack.telemetry-%s+xml'


class RootController(object):

    v2 = v2.V2Controller()

    @pecan.expose('json')
    def index(self):
        base_url = pecan.request.host_url
        available = [{'tag': 'v2', 'date': '2013-02-13T00:00:00Z', }]
        collected = [version_descriptor(base_url, v['tag'], v['date'])
                     for v in available]
        versions = {'versions': {'values': collected}}
        return versions
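The excerpt above calls version_descriptor without showing it. Here is a minimal sketch of what it could look like, reconstructed only from the JSON response shown in the curl output earlier; the real function in the repository may differ, so treat the body as an assumption.

```python
MEDIA_TYPE_JSON = 'application/vnd.openstack.telemetry-%s+json'
MEDIA_TYPE_XML = 'application/vnd.openstack.telemetry-%s+xml'


def version_descriptor(base_url, version, released_on):
    """Build one entry of the 'values' list (reconstructed from the output)."""
    return {
        'id': version,
        'status': 'stable',
        'updated': released_on,
        'media-types': [
            {'base': 'application/json', 'type': MEDIA_TYPE_JSON % version},
            {'base': 'application/xml', 'type': MEDIA_TYPE_XML % version},
        ],
        'links': [
            # Link to the version's own root, e.g. http://host:8080/v2
            {'href': '%s/%s' % (base_url, version), 'rel': 'self'},
            {'href': 'http://docs.openstack.org/', 'type': 'text/html',
             'rel': 'describedby'},
        ],
    }


entry = version_descriptor('http://172.16.235.157:8080', 'v2',
                           '2013-02-13T00:00:00Z')
print(entry['links'][0]['href'])
```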

The v2 attribute sends every endpoint under it to root.py in the v2 directory:

from test_project.controllers.v2 import meters
from test_project.controllers.v2 import query
from test_project.controllers.v2 import query2




class V2Controller(object):
    """Version 2 API controller root."""

    meters = meters.MetersController()
    query = query.QueryController()
    query2 = query2.QueryController2()

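This declares all the reachable URLs, namely the meters, query, and query2 endpoints. Because V2Controller hangs off the v2 attribute of RootController, they live under /v2:

http://localhost:8080/v2/meters
http://localhost:8080/v2/query
http://localhost:8080/v2/query2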
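The way nested controller attributes become URL segments can be sketched without Pecan. The resolve function below is a hypothetical helper that just walks attributes; Pecan's real object dispatch does much more (expose checks, _lookup, REST verbs), so this is only a mental model.

```python
class MetersController(object):
    def index(self):
        return 'meters'


class V2Controller(object):
    meters = MetersController()


class RootController(object):
    v2 = V2Controller()


def resolve(root, path):
    """Walk a path such as '/v2/meters' attribute by attribute from the root."""
    node = root
    for segment in path.strip('/').split('/'):
        if segment:
            node = getattr(node, segment)
    return node


controller = resolve(RootController(), '/v2/meters')
print(controller.index())
```

Each path segment is an attribute lookup on the previous controller, which is why adding an endpoint is just adding a class attribute.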

Of course, we still have to write the actual controllers. We create a file meters.py:

from pecan import rest
import pecan

class MetersController(rest.RestController):
    """Works on meters."""

    @pecan.expose()
    def get(self):
        print 'hahah'
        return "haha"

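expose() makes the function reachable. The name get means it handles the HTTP GET method; it does not mean there is an endpoint called get. We will see how to add new endpoints in a moment.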

Method

We can add put, delete, and post methods to this code and test them with the following requests:

curl -X GET http://localhost:8080/v2/meters
curl -X POST http://localhost:8080/v2/meters
curl -X PUT http://localhost:8080/v2/meters
curl -X DELETE http://localhost:8080/v2/meters
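As a rough mental model of how a REST-style controller picks a handler from the HTTP verb, here is a small sketch. dispatch_by_method and MethodNotAllowed are hypothetical names of mine; this is not the RestController implementation.

```python
class MethodNotAllowed(Exception):
    """Raised when the controller has no handler for the HTTP method."""


class MetersController(object):
    def get(self):
        return 'meters get'

    def post(self):
        return 'meters post'


def dispatch_by_method(controller, http_method):
    """Map an HTTP verb such as 'GET' to the method named 'get'."""
    handler = getattr(controller, http_method.lower(), None)
    if handler is None:
        raise MethodNotAllowed(http_method)
    return handler()


print(dispatch_by_method(MetersController(), 'GET'))
```

In this sketch, a DELETE request fails until a delete method is added to the controller, which matches the workflow described above: support a new verb by writing the function with that name.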

Adding endpoints

Now we write another real controller. We create a file query.py.

from pecan import rest, request
import pecan
from wsmeext.pecan import wsexpose


class QueryController(rest.RestController):

    # we need to define the test function
    _custom_actions = {
        'testla': ['GET'],
        'input': ['GET'],
    }
   
    @pecan.expose()
    def get(self):
        print 'return query'
        return "query get"
    
    @pecan.expose()
    def post(self):
        return "query post"

    @pecan.expose()
    def testla(self):
        return "fadsfasdf"

    # wsexpose(return type, type of number, type of check)
    @wsexpose(unicode, int, unicode)
    def input(self, number, check):
        print number
        print check
        return 'aaa'

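Besides supporting the GET and POST methods, we added two endpoints, testla and input. In addition to writing these functions, we must first declare the methods each one accepts in _custom_actions; a request whose method is not listed is rejected.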
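The rule that _custom_actions expresses is small enough to sketch on its own. The check below is a hypothetical helper under my own assumptions; it only shows the allow-list idea, not what Pecan does internally or which status code it returns.

```python
# Same shape as the _custom_actions dict in query.py.
CUSTOM_ACTIONS = {
    'testla': ['GET'],
    'input': ['GET'],
}


def is_action_allowed(actions, name, http_method):
    """Return True if `name` is declared and lists `http_method`."""
    allowed_methods = actions.get(name)
    return allowed_methods is not None and http_method.upper() in allowed_methods


print(is_action_allowed(CUSTOM_ACTIONS, 'testla', 'GET'))
print(is_action_allowed(CUSTOM_ACTIONS, 'testla', 'POST'))
```

An endpoint that is written but not declared, or declared with the wrong verb, fails this check, which is the situation to look for when a new endpoint unexpectedly refuses requests.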

Input/output type control

As shown above, input uses wsexpose to define its input and output types, so @expose() is no longer needed.
The form is @wsexpose(output type, input1 type, input2 type).

Try this request:

    curl 'localhost:8080/v2/query/input?number=3&check=aaa'

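The b in each a=b pair of the query string maps to input1 and input2 in order, that is, 3 and aaa.
Note that the URL must be wrapped in single quotes, as in curl 'xxxx'. Without the quotes, the shell treats & as its background operator, so the request fails when two query parameters are passed.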
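What typed inputs buy us can be shown with the standard library alone. The sketch below parses the same query string and converts each value to a declared type; coerce_arguments is a hypothetical helper, and WSME's real validation is richer than this.

```python
from urllib.parse import parse_qs


def coerce_arguments(query_string, signature):
    """Convert query-string values to the declared types.

    `signature` is a list of (name, type) pairs, e.g. [('number', int)].
    """
    raw = parse_qs(query_string)
    result = {}
    for name, expected_type in signature:
        if name not in raw:
            raise ValueError('missing argument: %s' % name)
        # parse_qs returns a list per key; take the first value.
        result[name] = expected_type(raw[name][0])
    return result


args = coerce_arguments('number=3&check=aaa',
                        [('number', int), ('check', str)])
print(args)
```

With this in place, a request such as number=abc fails during conversion instead of reaching the handler with a bad value, which is the main benefit of declaring types up front.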

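Using the lookup function to control URL path access

We write one more controller in a file query2.py. Here we look at the _lookup function that Pecan defines. Put simply, it sub-routes different paths to different controllers. The following example illustrates it.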

from pecan import rest, request
import commands  # used by NextController.name below
import pecan
from wsmeext.pecan import wsexpose

class QueryController2(rest.RestController):

    @pecan.expose()
    def _lookup(self, user_id, *remainder):
        if user_id:
            print ' into next controller'
            return NextController(user_id), remainder

    @pecan.expose()
    def get(self):
        print 'return query'
        return "query2 get"

    @pecan.expose()
    def post(self):
        return "query2 post"

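With this controller, curl http://localhost:8080/v2/query2 ends up in the get function. But sometimes that request does not cover what we need, for example:

curl http://localhost:8080/v2/query2/user_id

This is a very common request pattern, where user_id is usually a UUID. In this case the request goes into the _lookup function, and we hand it over to a new controller, NextController, shown below.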

class NextController(rest.RestController):
    _custom_actions = {
        'name': ['GET'],
        'methodget': ['GET','POST'],
    }
    def __init__(self, user_id=None):
        print 'into next init'
        print 'the userid is %s'%user_id
        self.user_id = user_id

    @pecan.expose()
    def get(self):
        return 'next controller get routing'

    @pecan.expose()
    def methodget(self):
        print 'the header is %s'%pecan.request.headers.get('testheader')
        print 'the method is %s'%pecan.request.method
        print 'the data is %s'%pecan.request.body
        print 'the host is %s'%pecan.request.host
        print 'the param is %s'%pecan.request.params.get('aaa')
        print 'the query is %s'%pecan.request.query_string
        print 'the path is %s'%pecan.request.path
        print 'the path_info is %s'%pecan.request.path_info
        print 'the path_qs is %s'%pecan.request.path_qs

        return 'get method ok'

    @pecan.expose()
    def name(self):
        strcmd='dd if=/dev/urandom of=/root/%s bs=1G count=10'%self.user_id
        commands.getstatusoutput(strcmd)
        return 'next controller name routing'

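We can try it with the following request:

   curl http://localhost:8080/v2/query2/aaa-bbb-ccc

This request reaches the get function of NextController, where aaa-bbb-ccc is available as self.user_id.

Now let's try another request:

   curl http://localhost:8080/v2/query2/aaa-bbb-ccc/name

This request reaches the name function.

The name function deliberately uses dd to write a large file. This is for an experiment: we will use it in the non-blocking test later.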
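The _lookup flow can be modeled in plain Python: consume one path segment, build a new controller from it, and continue routing with whatever is left. The route function below is a hypothetical simplification of mine and not Pecan's router; the controllers are trimmed-down stand-ins for the ones above.

```python
class NextController(object):
    def __init__(self, user_id):
        self.user_id = user_id

    def get(self):
        return 'user %s' % self.user_id

    def name(self):
        return 'name of %s' % self.user_id


class QueryController2(object):
    def get(self):
        return 'query2 get'

    def _lookup(self, user_id, *remainder):
        # Consume one segment and hand the rest to a new controller.
        return NextController(user_id), remainder


def route(controller, segments):
    """Return the handler result for the segments after `controller`."""
    if not segments:
        return controller.get()
    head, rest = segments[0], segments[1:]
    handler = getattr(controller, head, None)
    if callable(handler) and not head.startswith('_'):
        return handler()
    next_controller, remainder = controller._lookup(head, *rest)
    return route(next_controller, list(remainder))


print(route(QueryController2(), ['aaa-bbb-ccc', 'name']))
```

The three cases in the text map to route(..., []), route(..., ['aaa-bbb-ccc']), and route(..., ['aaa-bbb-ccc', 'name']): the first stays in QueryController2, the other two pass through _lookup first.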

Request Information

The client request is:

curl -H "testheader:aaa" 'http://localhost:8080/v2/query2/aaa/methodget?aaa=1&bbb=2' -d "this is data"

methodget shows how to read the request. Two points are worth noting:
1. In Pecan, sending a body requires POST.
2. To find the functions that pecan.request provides, look up pecan.Request.

The result is:

the header is aaa
the method is POST
the data is this is data
the host is localhost:8080
the param is 1
the query is aaa=1&bbb=2
the path is /v2/query2/aaa/methodget
the path_info is /v2/query2/aaa/methodget
the path_qs is /v2/query2/aaa/methodget?aaa=1&bbb=2
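How path, query string, and path_qs relate to each other can be checked with the standard library on the same URL. This sketch uses urllib.parse rather than pecan.request, so it only illustrates the naming; the variable names are mine.

```python
from urllib.parse import parse_qs, urlsplit

url = 'http://localhost:8080/v2/query2/aaa/methodget?aaa=1&bbb=2'
parts = urlsplit(url)

host = parts.netloc                       # host and port
path = parts.path                         # path without the query string
query_string = parts.query                # everything after '?'
path_qs = '%s?%s' % (path, query_string)  # path plus query string
param_aaa = parse_qs(query_string)['aaa'][0]

print(host, path, query_string, path_qs, param_aaa)
```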

Paste

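Paste is a wonderful thing for programmers: it lets you restructure a program, for example by supporting middleware, which makes the whole architecture flexible.

In the test_project directory we added app.py. We no longer use Pecan's built-in way of starting the server; instead we define the startup ourselves.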

import logging
import os
import pecan
from paste import deploy
import config as api_config
from werkzeug import serving

def setup_app(pecan_config=None, extra_hooks=None):
    app = pecan.make_app(
        pecan_config.app.root,
        debug=True,
    )
    return app


def get_pecan_config():
    filename = api_config.__file__.replace('.pyc', '.py')
    return pecan.configuration.conf_from_file(filename)


class VersionSelectorApplication(object):
    def __init__(self):
        pc = get_pecan_config()
        self.v2 = setup_app(pecan_config=pc)

    def __call__(self, environ, start_response):
        # Only the v2 application is built in __init__, so route everything to it.
        return self.v2(environ, start_response)

def load_app():
    # Absolute path to the Paste config; adjust it to your environment.
    cfg_path = '/root/pecan_test/test_project/api_paste.ini'
    if not os.path.exists(cfg_path):
        raise IOError('paste config file not found: %s' % cfg_path)
    print 'api paste file %s' % cfg_path
    return deploy.loadapp("config:" + cfg_path)


# you must apply body for post method such as curl -X POST 'http://localhost:8080/v2/query2' -d "dddd"
def build_server():
    print 'start to build paste server'

    app = load_app()
    host = '0.0.0.0'
    port = '8080'
    if host == '0.0.0.0':
        print 'port %s'%port
    else:
        print 'host %s'%host

    workers=1
    serving.run_simple(host, port,
                      app, processes=workers, threaded=1000)

def app_factory(global_config, **local_conf):
    print 'into app_factory'
    return VersionSelectorApplication()

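Here pecan_config is the config.py file. It has to be specified when running the standard pecan serve, and likewise we definitely need it when starting the server another way.

The path of the Paste config file is hard-coded, so anyone who wants to run this will probably need to adjust cfg_path = '/root/pecan_test/test_project/api_paste.ini'.

As mentioned, we start the server a different way. We wrote api.py and run it to start the server, which is done through the werkzeug package.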

import app
def main():
    #service.prepare_service()
    print ' start to run server'
    app.build_server()

main()

Output:

root@ubuntu:~/pecan_test/test_project# python api.py


 start to run server
start to build paste server
api paste file /root/pecan_test/test_project/api_paste.ini
into app_factory
port 8080
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)

middleware

Now let's look at the other key piece that drives Paste when it starts the program: the configuration file api_paste.ini.

[pipeline:main]
pipeline = authtoken api-server

[app:api-server]
paste.app_factory = app:app_factory

[filter:authtoken]
paste.filter_factory = healthcheck:filter_factory

[filter:request_id]
paste.filter_factory = oslo.middleware:RequestId.factory

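This config says that a request passes through a pipeline: first authtoken, then api-server. Here authtoken is healthcheck.py and api-server is app.py. We can therefore redefine the flow of a request and how many functions it passes through, which is exactly the idea of middleware.

You can read healthcheck.py for yourself; I will not go into it further.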
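For readers who have not seen a Paste filter before, here is a minimal sketch of the shape such a module takes. It is not the content of my healthcheck.py; HealthCheckMiddleware, the /healthcheck path, and the api_server stand-in are assumptions for illustration. The part that follows the Paste Deploy convention is filter_factory(global_conf, **local_conf) returning a function that wraps the next WSGI app.

```python
def filter_factory(global_conf, **local_conf):
    """Paste Deploy calls this and expects a function that wraps an app."""
    def make_filter(app):
        return HealthCheckMiddleware(app)
    return make_filter


class HealthCheckMiddleware(object):
    """Answer /healthcheck directly; pass everything else down the pipeline."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get('PATH_INFO') == '/healthcheck':
            start_response('200 OK', [('Content-Type', 'text/plain')])
            return [b'OK']
        return self.app(environ, start_response)


def api_server(environ, start_response):
    """Stand-in for the Pecan application at the end of the pipeline."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'api-server']


def call(app, path):
    """Invoke a WSGI app directly and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    environ = {'PATH_INFO': path, 'REQUEST_METHOD': 'GET'}
    body = b''.join(app(environ, start_response))
    return captured['status'], body


pipeline = filter_factory({})(api_server)
print(call(pipeline, '/healthcheck'))
print(call(pipeline, '/v2/meters'))
```

The pipeline line in api_paste.ini does the same wrapping declaratively: each filter receives the app to its right and returns a new app.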

Non-blocking configuration

In app.py, we start the web server with this code:

   workers=1
   serving.run_simple(host, port,
                    app, processes=workers, threaded=1000)

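If neither threading nor multiple processes is enabled, which is the default, concurrency is 1: while one request is still being handled, the next has to wait for it to finish. That is certainly not the setting we want; in other words, requests are blocked. To test this, the name function reached through QueryController2 runs dd if=/dev/urandom of=/root/uuid bs=1G count=10. It blocks the request for roughly 15 seconds, which makes it a convenient blocking test.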
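The difference between blocked and non-blocked handling can be reproduced in miniature without a web server. In this sketch, slow_handler is a hypothetical stand-in for the dd call (a short sleep instead of 15 seconds), and the two runners compare one-at-a-time handling with one thread per request.

```python
import threading
import time


def slow_handler(delay=0.2):
    """Stand-in for a request that blocks, like the dd call in name()."""
    time.sleep(delay)


def run_sequential(count):
    """Handle requests one after another, as with concurrency 1."""
    start = time.time()
    for _ in range(count):
        slow_handler()
    return time.time() - start


def run_threaded(count):
    """Handle each request in its own thread."""
    start = time.time()
    threads = [threading.Thread(target=slow_handler) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.time() - start


sequential_seconds = run_sequential(3)
threaded_seconds = run_threaded(3)
print('sequential: %.2fs, threaded: %.2fs'
      % (sequential_seconds, threaded_seconds))
```

The sequential run takes about the sum of the delays, while the threaded run takes about one delay, which is the behavior we want from the server.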

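When we set workers=1 and threaded to 100, we got non-blocking behavior. The official definitions are:
* threaded: should the process handle each request in a separate thread?
* processes: if greater than 1 then handle each request in a new process up to this maximum number of concurrent processes.

I did not really measure how far this concurrency goes, because running that experiment crashed my VM, so I stopped. In a 2-CPU environment, the result below shows 5 concurrent requests. I did try more, but that is what brought the VM down, so I left the experiment there.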

Tasks: 418 total,   7 running, 411 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us, 99.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:    487204 total,   480168 used,     7036 free,     8760 buffers
KiB Swap:  1046524 total,        0 used,  1046524 free.   171796 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  1620 root      20   0 1054536  34552   1680 R  41.7  7.1   0:25.14 dd
  1615 root      20   0 1054536  34556   1684 R  41.4  7.1   0:27.32 dd
  1625 root      20   0 1054536  34524   1648 R  39.1  7.1   0:22.71 dd
  1631 root      20   0 1054536  34312   1572 R  39.1  7.0   0:08.27 dd
  1610 root      20   0 1054536  34532   1656 R  37.7  7.1   0:29.43 dd

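I believe this setting basically meets our needs. I will tune it further as I gain more experience with it.

Another plus: along the way I also tried combining it with eventlet and got the same result, so for now I believe Pecan does not need eventlet for concurrency.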

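Oddities of the threaded and processes settings

Looking at the werkzeug source code (https://github.com/pallets/werkzeug/blob/master/werkzeug/serving.py#L534), threaded and processes behave oddly. I took threaded to be the number of request handlers inside a process and processes to be the number of forks, yet threaded cannot be combined with processes greater than 1. One possible reading is that a process occupies a single CPU and threaded spreads concurrent requests over that CPU. But that guess does not match my experiment: as soon as threaded is greater than 1 (processes must be 1), work is spread across multiple CPUs, concurrency exceeds the threaded value, and all CPUs are busy. The code below suggests part of the answer: make_server only tests threaded for truthiness, so any non-zero value simply selects ThreadedWSGIServer, which handles each request in its own thread with no count limit, and each dd runs as a separate OS process that the kernel can schedule on any CPU. I may still be misunderstanding something, or it may be related to running in a VM. I do not want to dig into this further for now; it works well enough.

Below is the werkzeug code I consulted. My original understanding of it differs from my experimental results, and I need to confirm this again when I have time.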

class ThreadedWSGIServer(ThreadingMixIn, BaseWSGIServer):

    """A WSGI server that does threading."""
    multithread = True


class ForkingWSGIServer(ForkingMixIn, BaseWSGIServer):

    """A WSGI server that does forking."""
    multiprocess = True

    def __init__(self, host, port, app, processes=40, handler=None,
                 passthrough_errors=False, ssl_context=None, fd=None):
        BaseWSGIServer.__init__(self, host, port, app, handler,
                                passthrough_errors, ssl_context, fd)
        self.max_children = processes
        
      



def make_server(host=None, port=None, app=None, threaded=False, processes=1,
                request_handler=None, passthrough_errors=False,
                ssl_context=None, fd=None):
    """Create a new server instance that is either threaded, or forks
    or just processes one request after another.
    """
    if threaded and processes > 1:
        raise ValueError("cannot have a multithreaded and "
                         "multi process server.")
    elif threaded:
        return ThreadedWSGIServer(host, port, app, request_handler,
                                  passthrough_errors, ssl_context, fd=fd)
    elif processes > 1:
        return ForkingWSGIServer(host, port, app, processes, request_handler,
                                 passthrough_errors, ssl_context, fd=fd)
    else:
        return BaseWSGIServer(host, port, app, request_handler,
                              passthrough_errors, ssl_context, fd=fd)
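The branching in make_server above can be isolated to see how different argument values are treated. choose_server is a hypothetical helper that copies only the if/elif structure and returns the class name instead of building a server.

```python
def choose_server(threaded=False, processes=1):
    """Return the name of the server class the branches above would pick."""
    if threaded and processes > 1:
        raise ValueError('cannot have a multithreaded and '
                         'multi process server.')
    elif threaded:
        return 'ThreadedWSGIServer'
    elif processes > 1:
        return 'ForkingWSGIServer'
    return 'BaseWSGIServer'


for value in (True, 1, 100, 1000):
    print(value, choose_server(threaded=value, processes=1))
```

Every truthy value of threaded lands in the same branch, so in this code path threaded=1000 and threaded=True are equivalent; the number is never used as a limit.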

As an aside: a class can be used like a function while still offering everything else a class provides, which is really handy.
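What I mean is the __call__ method, the same trick VersionSelectorApplication uses so that an instance can serve as a WSGI app. A tiny sketch with a made-up Greeter class shows an object that is called like a function but also keeps state:

```python
class Greeter(object):
    """An instance behaves like a function but also remembers its calls."""

    def __init__(self, greeting):
        self.greeting = greeting
        self.calls = 0

    def __call__(self, name):
        self.calls += 1
        return '%s, %s' % (self.greeting, name)


greet = Greeter('hello')
print(greet('pecan'))
```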

I have found what I currently consider the best web framework: Pecan. With the support of OpenStack's large open-source community, Pecan should keep evolving.

Saturday, May 14, 2016
