Framework: understanding Linux high-performance network I/O + the Reactor model


Preface

Network I/O can be understood as data flowing over the network. Usually we create a TCP or UDP channel to a remote end based on a socket, and then read from and write to it. A single socket can be handled efficiently by one thread; but if there are 10K socket connections, or more, how do we handle them with high performance?

  • Introduction to basic concepts
  • The network I/O read/write process
  • The five network I/O models under Linux
  • A closer look at multiplexed I/O
  • The Reactor model
  • The Proactor model


Introduction to basic concepts

  • Process (thread) switching

    * Every operating system can schedule processes: it can suspend the currently running process and resume a previously suspended one
  • Process (thread) blocking

    * A running process sometimes has to wait for another event to complete, such as acquiring a lock or an I/O read/write request; the waiting process is then blocked by the system and no longer occupies the CPU
  • File descriptor

    * In Linux, a file descriptor is an abstraction used to refer to a file; it is a non-negative integer. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process
  • Linux signal handling

    * While running, a Linux process can receive signal values from the system or from other processes, and then run the handler registered for that signal value; a signal is a software simulation of a hardware interrupt
    

User space, kernel space and buffers were covered in the chapter on the zero-copy mechanism, so they are omitted here

The network I/O read/write process

  • When a socket read is initiated from user space, it causes a context switch; the user process blocks (R1) waiting for the network data to arrive and be copied from the NIC into the kernel, and then (R2) waiting for the data to be copied from the kernel buffer into the user process buffer. After that the process is scheduled again and handles the data it received
  • Here we give the first stage of the socket read the alias R1 and call the second stage R2
  • When a socket send is initiated from user space, it causes a context switch; the user process blocks waiting for (1) the data to be copied from the user process buffer into the kernel buffer. Once the copy is finished, the process is scheduled again

The five network I/O models under Linux

Blocking I/O (blocking IO)

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *from, socklen_t *fromlen);

  • The most basic I/O model is the blocking I/O model, and it is also the simplest one. All operations are performed sequentially
  • In the blocking IO model, a user-space application performs a system call (recvfrom) that blocks the application until the data in the kernel buffer is ready and has been copied from the kernel into the user process. Finally the process is woken up by the system to handle the data
  • Across the two consecutive stages R1 and R2, the whole process is blocked (a minimal sketch follows this list)
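As an illustration only, here is a minimal sketch of a blocking UDP receive loop; the port number and buffer size are arbitrary assumptions and not taken from the article.

#include <stdio.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);            /* UDP socket, blocking by default */
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8000);                         /* assumed port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    char buf[4096];
    for (;;) {
        /* R1 + R2: the call blocks until data arrives in the kernel
         * and has been copied into buf */
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
        if (n < 0) break;
        printf("got %zd bytes\n", n);
    }
    close(fd);
    return 0;
}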

Non-blocking I/O (nonblocking IO)

  • Non-blocking IO is also a kind of synchronous IO. It is implemented with a polling mechanism: in this model, the socket is opened in non-blocking mode. The I/O operation therefore does not complete immediately; instead it returns an error code (EWOULDBLOCK) indicating that the operation has not finished
  • The process polls the kernel for data: if the data is not ready, recvfrom returns EWOULDBLOCK. The process keeps issuing recvfrom calls, and of course it can pause in between to do other things
  • Once the kernel data is ready, the data is copied into user space, recvfrom returns without an error code, and the process handles the data. Note that during the whole data-copy phase the process is still blocked
  • The process blocks in the R2 stage; it does not block in R1, but it has to keep polling (a sketch follows this list)
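A minimal polling sketch, assuming an already-bound UDP socket; the only points it illustrates are setting O_NONBLOCK and retrying on EWOULDBLOCK/EAGAIN.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* fd is an already-bound socket; make it non-blocking */
static void set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

static void poll_read(int fd) {
    char buf[4096];
    for (;;) {
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
        if (n >= 0) {                        /* R2 finished: data copied to buf */
            printf("got %zd bytes\n", n);
            break;
        }
        if (errno == EWOULDBLOCK || errno == EAGAIN) {
            /* R1 not finished: no data in the kernel yet, do something else or retry */
            usleep(1000);
            continue;
        }
        perror("recvfrom");
        break;
    }
}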

Multiplexed I/O (IO multiplexing)

  • There are usually a large number of socket connections. If we could query the read/write status of multiple sockets at once and handle whichever one is ready, processing would be much more efficient. That is "I/O multiplexing": "multi" refers to the many sockets, and "multiplexing" means they reuse the same process
  • Linux provides select, poll, epoll and other implementations of multiplexed I/O
  • select, poll and epoll are all blocking calls
  • Unlike blocking IO, select does not wait until all socket data has arrived; it resumes the user process as soon as some of the sockets have data ready. How do we know which part of the data is ready in the kernel? Answer: that is left to the system
  • The process still blocks in both R1 and R2; but there is a trick in the R1 stage: in a multi-process or multi-threaded program we can dedicate just one process (thread) to the blocking select call and free the other threads (a usage sketch follows this list)
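A minimal select usage sketch, watching a single listening socket for readability; the descriptor name and the 5-second timeout are illustrative assumptions.

#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>

/* listen_fd is an already-listening TCP socket */
static void wait_readable(int listen_fd) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(listen_fd, &readfds);

    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };   /* assumed timeout */

    /* blocks (R1) until a watched fd is readable or the timeout expires */
    int ready = select(listen_fd + 1, &readfds, NULL, NULL, &tv);
    if (ready > 0 && FD_ISSET(listen_fd, &readfds)) {
        int conn = accept(listen_fd, NULL, NULL);         /* ready: accept immediately */
        if (conn >= 0) { /* hand conn off to a handler ... */ }
    } else if (ready == 0) {
        printf("timeout, no fd ready\n");
    }
}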

Signal-driven I/O (SIGIO)

  • We need to provide a signal-handling function and associate it with the socket; after the sigaction call is issued, the process is free to handle other things
  • When the data is ready in the kernel, the process receives a SIGIO signal, interrupts to run the signal handler, and calls recvfrom to read the data from the kernel into user space, then processes it
  • As you can see, the user process is not blocked in the R1 stage, but it still blocks waiting in R2 (a setup sketch follows this list)
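A minimal sketch of wiring up SIGIO for a socket; the handler body is a placeholder and the socket fd is assumed to exist already.

#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t data_ready = 0;

static void on_sigio(int signo) {
    (void)signo;
    data_ready = 1;            /* real code would recvfrom() here or defer to the main loop */
}

static void enable_sigio(int fd) {
    struct sigaction sa = {0};
    sa.sa_handler = on_sigio;
    sigaction(SIGIO, &sa, NULL);                 /* register the SIGIO handler */

    fcntl(fd, F_SETOWN, getpid());               /* deliver SIGIO for this fd to our process */
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_ASYNC);         /* enable signal-driven I/O on the socket */
}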

Asynchronous IO (the POSIX aio_ family of functions)

  • Compared with synchronous IO, asynchronous IO does not block the current process after the user process issues an asynchronous read (aio_read) system call, regardless of whether the kernel buffer data is ready; once aio_read returns, the process can go on with other logic
  • When the socket data is ready in the kernel, the system copies the data directly from the kernel into user space and then signals the user process
  • The process is non-blocking in both R1 and R2 (a sketch follows this list)
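A minimal POSIX AIO sketch using aio_read; the descriptor, buffer size and the polling of aio_error are illustrative assumptions (the completion notification could also be a signal or a thread callback).

#include <aio.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* fd is an already-open descriptor */
static void async_read(int fd) {
    static char buf[4096];
    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);

    if (aio_read(&cb) < 0) { perror("aio_read"); return; }

    /* the process is free to do other work here; neither R1 nor R2 blocks it */

    while (aio_error(&cb) == EINPROGRESS) {
        /* ... other logic ... */
    }
    ssize_t n = aio_return(&cb);                 /* bytes read; data is already in buf */
    if (n > 0) printf("async read %zd bytes\n", n);
}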

A closer look at multiplexed I/O

select

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
  • 1) copy_from_user is used to copy the fd_set from user space into kernel space
  • 2) The callback function __pollwait is registered
  • 3) All of the fds are traversed and each one's poll method is called (for a socket this poll method is sock_poll, and sock_poll calls tcp_poll, udp_poll or datagram_poll)
  • 4) Taking tcp_poll as an example, its core is __pollwait, the callback registered above
  • 5) The main job of __pollwait is to hang current (the current process) on the device's wait queue. Different devices have different wait queues; for tcp_poll the wait queue is sk->sk_sleep (note that hanging a process on a wait queue does not put it to sleep). After the device receives a message (network device) or fills in file data (disk device), it wakes up the processes sleeping on the wait queue, and at that point current is woken up
  • 6) The poll method returns a mask describing whether the read/write operation is ready, and fd_set is filled in according to this mask
  • 7) If, after traversing all the fds, no readable/writable mask has been returned, schedule_timeout is called to put the process that called select (that is, current) to sleep
  • 8) When the device driver finds its own resources readable or writable, it wakes up the processes sleeping on the wait queue. If nothing wakes it within the time limit (specified by timeout), the process that called select is woken up again, gets the CPU, and traverses the fds once more to check whether any fd is ready
  • 9) The fd_set is copied from kernel space back to user space

The shortcomings of select

  • Every call to select requires copying the fd set from user space into kernel space, which is expensive when there are many fds
  • Every call to select also requires the kernel to traverse all the fds passed in, which is likewise expensive when there are many fds
  • select supports too few file descriptors; the default is 1024

epoll

int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
  • Calling epoll_create builds a red-black tree in the kernel cache to store the sockets later delivered by epoll_ctl, and also creates an rdllist doubly linked list to store ready events. When epoll_wait is called, it only needs to look at the data in this rdllist doubly linked list
  • When epoll_ctl adds, modifies or deletes an event in the epoll object, it operates on the rbr red-black tree, which is very fast
  • Events added to epoll establish a callback relationship with the device (such as the network card); the callback is invoked when the corresponding event occurs on the device and adds the event to the rdllist doubly linked list. In the kernel this callback is called ep_poll_callback (a usage sketch follows this list)
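A minimal epoll usage sketch for a single listening socket; the event batch size is an assumption and error handling is mostly omitted.

#include <sys/epoll.h>
#include <sys/socket.h>

#define MAX_EVENTS 64                          /* assumed batch size */

/* listen_fd is an already-listening TCP socket */
static void epoll_loop(int listen_fd) {
    int epfd = epoll_create1(0);               /* builds the rbr tree and rdllist in the kernel */

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);   /* the fd is copied into the kernel once */

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* only ready fds come back; no need to traverse everything registered */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                int conn = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                epoll_ctl(epfd, EPOLL_CTL_ADD, conn, &cev);
            } else {
                /* read from events[i].data.fd and handle the request ... */
            }
        }
    }
}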

The two trigger modes of epoll

  • epoll has two trigger modes, EPOLLLT and EPOLLET; LT is the default mode, and ET is the "high-speed" mode (it only supports non-blocking sockets)

    * In LT (level-triggered) mode, as long as the file descriptor still has data to read, **every epoll_wait call will report its read event**
    * In ET (edge-triggered) mode, when an I/O event is detected, epoll_wait returns the file descriptors that have event notifications; for a readable descriptor, the file must be read until it is empty (or returns EWOULDBLOCK), **otherwise the next epoll_wait will not report this event again** (an ET read loop sketch follows)
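A minimal sketch of an ET-mode read loop: the descriptor must be non-blocking and must be drained until EAGAIN/EWOULDBLOCK, otherwise the remaining data will not be reported again.

#include <errno.h>
#include <unistd.h>

/* conn_fd was registered with events = EPOLLIN | EPOLLET and set to O_NONBLOCK */
static void drain_et(int conn_fd) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(conn_fd, buf, sizeof(buf));
        if (n > 0) {
            /* handle n bytes of data ... */
        } else if (n == 0) {
            close(conn_fd);                    /* peer closed the connection */
            break;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                             /* fully drained; wait for the next edge */
        } else {
            close(conn_fd);                    /* real error */
            break;
        }
    }
}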
    

Advantages of epoll over select

  • epoll solves the three shortcomings of select

    * **For the first shortcoming**: epoll's solution is the epoll_ctl function. Each time a new event is registered with the epoll handle (EPOLL_CTL_ADD is specified in epoll_ctl), the fd is copied into the kernel, rather than being copied repeatedly in epoll_wait. epoll guarantees that each fd is copied only once over its whole lifetime (epoll_wait does not need to copy it again)
    * **For the second shortcoming**: epoll specifies a callback function for each fd; when the device becomes ready and wakes the waiters on the wait queue, this callback is invoked and adds the ready fd to a ready list. All epoll_wait has to do is check whether there are ready fds in that list (no traversal required)
    * **For the third shortcoming**: epoll has no such limit; the upper bound on the number of fds it supports is the maximum number of files that can be opened, which is generally far higher than 2048 (for example, on a machine with 1GB of memory it is roughly 100,000). In general this number is closely related to system memory
  • epoll's high performance

    * epoll uses a red-black tree to store the file descriptor events to be monitored, so the add/delete/modify operations of epoll_ctl are fast
    * epoll does not need to traverse descriptors to find the ready fds; it returns the ready list directly
    * After Linux 2.6, mmap technology was used so that data no longer needs to be copied from the kernel to user space: zero copy
    

Is epoll's IO model synchronous or asynchronous?

  • Concept definitions

    * Synchronous I/O operation: causes the requesting process to block until the I/O operation completes
    * Asynchronous I/O operation: does not cause the requesting process to block; asynchronous I/O only handles the notification after the I/O operation completes, and does not actively read or write the data itself; the system kernel does the reading and writing
    * Blocking, non-blocking: whether the data the process/thread wants to access is ready, and whether the process/thread needs to wait
  • The essence of asynchronous IO is that the I/O call does not block the requester. As introduced earlier, an I/O operation is split into two stages: R1, waiting for the data to be ready, and R2, copying the data from the kernel to the process. Although epoll uses an mmap mechanism after kernel 2.6 so that it does not need to copy in the R2 stage, it still blocks in R1. It is therefore classified as synchronous IO

Reactor Model

The central idea of Reactor is to register all I/O events of interest with a central I/O multiplexer, while the main thread/process blocks on that multiplexer; as soon as an I/O event arrives or becomes ready, the multiplexer returns and dispatches the pre-registered I/O event to the corresponding handler

Introduction to related concepts:

  • Event: it is a state; for example, a read-ready event is the state in which data can be read from the kernel
  • Event demultiplexer: waiting for events to happen is usually handed over to epoll or select; since events arrive randomly and asynchronously, epoll has to be called in a loop, and the module in the framework that wraps this is the event demultiplexer (simply put, a wrapper around epoll)
  • Event handler: after an event occurs, a process or thread is needed to handle it; that handler is the event handler, and it generally runs on a different thread from the event demultiplexer

The general flow of Reactor

  • 1) The application registers read/write ready events and the read/write ready event handler with the event demultiplexer
  • 2) The event demultiplexer waits for a read/write ready event to occur
  • 3) A read/write ready event occurs and activates the event demultiplexer, which calls the read/write ready event handler
  • 4) The event handler first reads the data from the kernel into user space, then processes the data (a single-threaded sketch follows this list)
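A minimal single-threaded Reactor sketch: handlers are stored as function pointers keyed by fd, and an epoll loop plays the role of the event demultiplexer. The structure and names are illustrative assumptions, not the article's own framework.

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FD     1024
#define MAX_EVENTS 64

typedef void (*event_handler)(int fd);         /* the "event handler" role */

static event_handler handlers[MAX_FD];         /* fd -> handler table */
static int epfd;                               /* the "event demultiplexer" */

static void reactor_register(int fd, event_handler h) {
    if (fd < 0 || fd >= MAX_FD) return;
    handlers[fd] = h;
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

static void on_readable(int fd) {
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));    /* step 4: read from kernel, then process */
    if (n <= 0) { close(fd); return; }
    /* ... process the n bytes ... */
}

static void on_accept(int listen_fd) {
    int conn = accept(listen_fd, NULL, NULL);
    if (conn >= 0) reactor_register(conn, on_readable);
}

static void reactor_run(int listen_fd) {
    epfd = epoll_create1(0);
    reactor_register(listen_fd, on_accept);    /* step 1: register event + handler */
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);   /* steps 2-3: wait, then dispatch */
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (handlers[fd]) handlers[fd](fd);
        }
    }
}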

Single-threaded + Reactor

Multi-threaded + Reactor

Multi-threaded + multiple Reactors

The general flow of the Proactor model

  • 1) The application registers a read-completion event and a read-completion event handler with the event demultiplexer, and issues an asynchronous read request to the system
  • 2) The event demultiplexer waits for the read-completion event
  • 3) While the demultiplexer is waiting, the kernel's read thread performs the actual read, copies the data into the process buffer, and finally notifies the event demultiplexer that the read has completed
  • 4) The event demultiplexer hears the read-completion event and activates the read-completion event handler
  • 5) The read-completion event handler processes the data that is already in the user process buffer (a sketch follows this list)
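A minimal Proactor-flavoured sketch using POSIX AIO with a thread completion callback (SIGEV_THREAD); the descriptor and buffer are assumptions, and a real demultiplexer would queue completions rather than handle them inline.

#include <aio.h>
#include <stdio.h>
#include <string.h>

static char buf[4096];

/* the "read-completion event handler": the data is already in buf */
static void on_read_complete(union sigval sv) {
    struct aiocb *cb = sv.sival_ptr;
    ssize_t n = aio_return(cb);
    if (n > 0) printf("completion handler got %zd bytes\n", n);
}

/* fd is an already-open descriptor */
static void proactor_read(int fd) {
    static struct aiocb cb;                    /* static: must outlive this call */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);

    /* register the completion notification: run on_read_complete in a new thread */
    cb.aio_sigevent.sigev_notify = SIGEV_THREAD;
    cb.aio_sigevent.sigev_notify_function = on_read_complete;
    cb.aio_sigevent.sigev_value.sival_ptr = &cb;

    aio_read(&cb);                             /* the kernel does the read and the copy into buf */
    /* the calling thread is free to do other work here */
}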

The difference between Proactor and Reactor

  • Proactor is based on the idea of asynchronous I/O, while Reactor is usually based on the idea of multiplexed I/O
  • Proactor does not need to copy data from the kernel to user space itself; that is done by the system

Corrections of any mistakes in the text are welcome

