Tuesday, December 9, 2014

How you receive an Ethernet packet: from the Ethernet driver to the BSD socket.


I've worked with a lot of Ethernet driver developers and network application engineers over the past few years.
It surprises me that someone can spend years writing Ethernet MAC drivers or BSD socket code and never look into how the data is passed from the hardware driver, through the IP stack, to the BSD socket API.

I wanted to share this because I think it is sometimes better to see the whole picture.

P.S.
The function and variable names below come from the kernel source I was working on at the time, so they may differ in other kernel versions. The idea is the same.

OVERVIEW:

[HW driver] --> look up packet type (IPv4? IPv6? etc.) --> look up destination (route, drop, or input) --> [if input] look up protocol (TCP, UDP, ICMP, etc.) --> look up socket --> copy to user space

PATH:

The net device driver calls "netif_receive_skb()".

netif_receive_skb(){

    //1. look up the matching struct packet_type *
    //2. exec packet_type->func()
    //
    // example:
    //    returned : struct packet_type ipv6_packet_type{
    //            .func = ipv6_rcv    // takes a struct sk_buff *
    //            }
    //
    //    the route lookup later attaches a dst_entry to the sk_buff:
    //    struct sk_buff{
    //            struct dst_entry ._skb_dst    // with .input = ip6_input
    //    }

    packet_type->func();
}
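
To make the packet_type lookup a bit more concrete, here is a small user-space sketch in C. It is not kernel code: struct frame, struct pkt_type, demo_receive() and the demo_*_rcv() handlers are names I made up for illustration, and it reads the EtherType straight from the raw header, whereas the real driver has normally already stored the protocol in the sk_buff. VLAN tags are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: just the raw frame. */
struct frame {
    const uint8_t *data;   /* start of the Ethernet header */
    size_t len;            /* total frame length in bytes */
};

/* Hypothetical stand-in for struct packet_type. */
struct pkt_type {
    uint16_t ethertype;                  /* host byte order */
    int (*func)(const struct frame *f);  /* layer-3 receive handler */
};

static int demo_ipv4_rcv(const struct frame *f) { (void)f; return 4; }
static int demo_ipv6_rcv(const struct frame *f) { (void)f; return 6; }

static const struct pkt_type demo_ptypes[] = {
    { 0x0800, demo_ipv4_rcv },  /* EtherType for IPv4 */
    { 0x86DD, demo_ipv6_rcv },  /* EtherType for IPv6 */
};

/* Returns the handler's result, or -1 if no handler is registered. */
int demo_receive(const struct frame *f)
{
    if (f->len < 14)            /* shorter than an Ethernet header */
        return -1;

    /* The EtherType is bytes 12..13 of the header, big-endian. */
    uint16_t type = (uint16_t)((f->data[12] << 8) | f->data[13]);

    for (size_t i = 0; i < sizeof demo_ptypes / sizeof demo_ptypes[0]; i++)
        if (demo_ptypes[i].ethertype == type)
            return demo_ptypes[i].func(f);   /* packet_type->func() */
    return -1;
}
```

A frame whose EtherType is 0x86DD ends up in demo_ipv6_rcv(), which plays the role of ipv6_rcv() in the path above.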

                                                                                                                                                   
ipv6_rcv()
{
    //1. passes the packet through the netfilter PRE_ROUTING hook
    //2. calls ip6_rcv_finish(), which looks up the destination (a "struct dst_entry *")
    NF_HOOK(NF_INET_PRE_ROUTING, ip6_rcv_finish);
}
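
The hook call is easier to follow with a toy version. The sketch below is user-space C under my own assumptions: demo_nf_hook(), demo_drop_expired() and demo_continue() are hypothetical names, and the verdicts are reduced to accept or drop. It only shows the general shape: run the registered hooks, and call the continuation function only if none of them drops the packet.

```c
#include <stddef.h>

enum demo_verdict { DEMO_DROP = 0, DEMO_ACCEPT = 1 };

/* Hypothetical stand-in for a packet. */
struct demo_pkt { int hop_limit; };

typedef enum demo_verdict (*demo_hook_fn)(struct demo_pkt *p);
typedef int (*demo_okfn)(struct demo_pkt *p);

/* Run every hook in order; continue with okfn only if all accept. */
int demo_nf_hook(const demo_hook_fn *hooks, size_t n,
                 struct demo_pkt *p, demo_okfn okfn)
{
    for (size_t i = 0; i < n; i++)
        if (hooks[i](p) == DEMO_DROP)
            return -1;          /* dropped: okfn never runs */
    return okfn(p);             /* e.g. the ip6_rcv_finish() step */
}

/* Example hook: drop packets whose hop limit is used up. */
enum demo_verdict demo_drop_expired(struct demo_pkt *p)
{
    return p->hop_limit > 0 ? DEMO_ACCEPT : DEMO_DROP;
}

/* Example continuation, standing in for the next function in the path. */
int demo_continue(struct demo_pkt *p)
{
    (void)p;
    return 0;
}
```

With no hooks registered, demo_nf_hook() goes straight to the continuation, which is why the hook can be read as "call the next function" when following the receive path.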

ip6_rcv_finish(){
    // 1. calls ip6_route_input() to look up the route, then execs dst_entry->input()
    //
    //example:
    //the destination is an IPv6 destination to be input (delivered locally)
    //
    //struct dst_entry
    //{
    //    .input = ip6_input
    //}
    //
    ip6_route_input() //looks up all possible routes; if(input){dst_entry->input()}

}
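
Here is a rough user-space sketch of the "route lookup picks the input function" idea. Everything in it is hypothetical (struct demo_dst, struct pkt, demo_route_input(), demo_rcv_finish()), and the "routing table" is a single hard-coded local address, ::1. The point is only that the lookup attaches a dst to the packet, and the next step is whatever dst->input points to.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pkt;                                  /* forward declaration */

/* Hypothetical stand-in for struct dst_entry. */
struct demo_dst {
    int (*input)(struct pkt *p);
};

struct pkt {
    uint8_t daddr[16];                       /* IPv6 destination address */
    const struct demo_dst *dst;              /* filled in by the route lookup */
};

enum { DEMO_LOCAL = 1, DEMO_FORWARD = 2 };

static int demo_local_input(struct pkt *p) { (void)p; return DEMO_LOCAL; }
static int demo_forward(struct pkt *p)     { (void)p; return DEMO_FORWARD; }

static const struct demo_dst local_dst   = { demo_local_input };
static const struct demo_dst forward_dst = { demo_forward };

/* The one address this toy host owns: ::1 */
static const uint8_t my_addr[16] = { [15] = 1 };

/* Route lookup: attach a dst to the packet. */
void demo_route_input(struct pkt *p)
{
    p->dst = memcmp(p->daddr, my_addr, 16) == 0 ? &local_dst : &forward_dst;
}

/* Look up the route if needed, then call dst->input(). */
int demo_rcv_finish(struct pkt *p)
{
    if (!p->dst)
        demo_route_input(p);
    return p->dst->input(p);
}
```

A packet addressed to ::1 takes the demo_local_input() branch, which corresponds to ip6_input() in the path above. Anything else takes the forwarding branch.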



ip6_input()
{
    //calls ip6_input_finish()
    NF_HOOK(ip6_input_finish());
}

ip6_input_finish()
{
    //1. look up the protocol from the Next Header field of the IPv6 header (TCP, ICMP, UDP, etc.)
    //2. the lookup returns a "struct inet6_protocol"
    //3. exec inet6_protocol.handler()
    //
    //example:
    //returns struct inet6_protocol tcpv6_protocol{
    //    .handler = tcp_v6_rcv
    //    }
    //

    protocol.handler()  // in this example, calls tcp_v6_rcv()
}
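
The protocol lookup is basically a table indexed by protocol number. A minimal user-space sketch, assuming a packet with only the fixed 40-byte IPv6 header (extension headers are ignored), could look like this. struct demo_proto, demo_protos[] and demo_input_finish() are my own illustrative names, not the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inet6_protocol. */
struct demo_proto {
    const char *name;
    int (*handler)(const uint8_t *payload, size_t len);
};

static int demo_tcp_rcv(const uint8_t *p, size_t n)    { (void)p; (void)n; return 6; }
static int demo_udp_rcv(const uint8_t *p, size_t n)    { (void)p; (void)n; return 17; }
static int demo_icmpv6_rcv(const uint8_t *p, size_t n) { (void)p; (void)n; return 58; }

static const struct demo_proto demo_tcp    = { "tcp",    demo_tcp_rcv };
static const struct demo_proto demo_udp    = { "udp",    demo_udp_rcv };
static const struct demo_proto demo_icmpv6 = { "icmpv6", demo_icmpv6_rcv };

/* One slot per protocol number; empty slots stay NULL. */
static const struct demo_proto *const demo_protos[256] = {
    [6]  = &demo_tcp,     /* TCP    */
    [17] = &demo_udp,     /* UDP    */
    [58] = &demo_icmpv6,  /* ICMPv6 */
};

/* Returns the handler's result, or -1 if the packet cannot be delivered. */
int demo_input_finish(const uint8_t *pkt, size_t len)
{
    if (len < 40)                       /* fixed IPv6 header is 40 bytes */
        return -1;

    /* The Next Header field is byte 6 of the IPv6 header. */
    const struct demo_proto *proto = demo_protos[pkt[6]];
    if (!proto)
        return -1;                      /* no handler registered */

    return proto->handler(pkt + 40, len - 40);   /* protocol.handler() */
}
```

With Next Header set to 6, the call lands in demo_tcp_rcv(), the counterpart of tcp_v6_rcv() in the example above.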

tcp_v6_rcv()
{
    __inet6_lookup_skb(); // look up the socket that matches the packet
    tcp_v6_do_rcv();
}
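
The socket lookup can be pictured as a match on the address and port 4-tuple. The sketch below is a simplified user-space version with names I invented (struct demo_sock, demo_lookup()). It does a linear scan instead of a hash lookup, prefers an exact match on a connected socket, and falls back to a listening socket on the same local address and port.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a TCP socket entry. */
struct demo_sock {
    uint8_t  laddr[16], raddr[16];   /* local and remote IPv6 addresses */
    uint16_t lport, rport;           /* local and remote ports */
    int      listening;              /* 1: accepts any remote peer */
};

/* Returns the best matching socket, or NULL if none matches. */
const struct demo_sock *demo_lookup(const struct demo_sock *tbl, size_t n,
                                    const uint8_t saddr[16], uint16_t sport,
                                    const uint8_t daddr[16], uint16_t dport)
{
    const struct demo_sock *listener = NULL;

    for (size_t i = 0; i < n; i++) {
        const struct demo_sock *s = &tbl[i];

        /* The packet's destination must be the socket's local side. */
        if (s->lport != dport || memcmp(s->laddr, daddr, 16) != 0)
            continue;

        if (s->listening) {
            if (!listener)
                listener = s;        /* remember as fallback */
            continue;
        }

        /* Connected socket: the remote side must match the packet source. */
        if (s->rport == sport && memcmp(s->raddr, saddr, 16) == 0)
            return s;                /* exact 4-tuple match wins */
    }
    return listener;
}
```

A segment from a known peer finds its connected socket, a segment from a new peer on the same port finds the listener, and a segment for an unused port finds nothing.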

tcp_v6_do_rcv()
{
    tcp_rcv_established() --> skb_copy_datagram_iovec() --> memcpy_toiovec() --> copy_to_user()
}
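
The last step copies the payload into the buffers the application handed to the socket call. As a rough user-space illustration of that scatter copy, here is a sketch built on the POSIX struct iovec. demo_copy_to_iovec() is a hypothetical helper, and it uses a plain memcpy() where the kernel has to use copy_to_user(), because in the kernel the destination lives in user space.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/* Copies up to len bytes from src into the buffers described by
 * iov[0..iovcnt). Returns the number of bytes actually copied. */
size_t demo_copy_to_iovec(const struct iovec *iov, int iovcnt,
                          const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t copied = 0;

    for (int i = 0; i < iovcnt && copied < len; i++) {
        size_t n = iov[i].iov_len;
        if (n > len - copied)
            n = len - copied;        /* last buffer is only partly filled */
        memcpy(iov[i].iov_base, p + copied, n);
        copied += n;
    }
    return copied;
}
```

If the buffers are too small for the payload, the function copies what fits and returns the shorter count, so the caller can tell how much data is still pending.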

OTHERS:

addrconf_dst_alloc() //creates a dst_entry
