DragonFly BSD

IPFW3 Documentation

Bill Yuan

12 May, 2018


Introduction

IPFW is a stateful firewall originally written for FreeBSD. It comprises several components: the kernel firewall filter rule processor with its integrated packet accounting facility, the logging facility, NAT, the dummynet(4) traffic shaper, a forward facility, a bridge facility, and an ipstealth facility. It is one of the most advanced open-source firewalls.

DragonFly BSD is a logical continuation of the FreeBSD 4.x series, but it has diverged significantly from FreeBSD since the fork in 2003. For example, DragonFly BSD has implemented its own Light Weight Kernel Threads (LWKT) subsystem and a lightweight ports/messaging system.

In DragonFly, each CPU has its own thread scheduler. Upon creation, threads are assigned to processors and are never preemptively switched from one processor to another; they are only migrated by the passing of an inter-processor interrupt (IPI) message between the CPUs involved. Inter-processor thread scheduling is also accomplished by sending asynchronous IPI messages. One advantage to this clean compartmentalization of the threading subsystem is that the processors' on-board caches in Symmetric Multiprocessor Systems do not contain duplicated data, allowing for higher performance by giving each processor in the system the ability to use its own cache to store different things to work on.

The LWKT subsystem is being employed to partition work among multiple kernel threads (for example in the networking code there is one thread per protocol per processor), reducing competition by removing the need to share certain resources among various kernel tasks.

IPFW3 is a from-scratch rewrite of FreeBSD's IPFW for DragonFly BSD. It has a modular design and inherits the SMP-friendly characteristics of DragonFly BSD's LWKT, which makes IPFW3 a lockless, stateful firewall.

Brief notes on design

IPFW3 has a modular design: different functionality is implemented in separate loadable modules, which can be loaded as required. The IPFW3 core module provides only the allow and deny actions, and may be loaded manually with:

kldload ipfw3

More basic firewall filtering features, e.g., filtering based on source IP, are implemented in the basic module, which may be loaded by:

kldload ipfw3_basic

Besides the core and basic modules, there are also the layer2, layer4, in-kernel NAT, and dummynet modules.

Note that the corresponding kernel module must be loaded; otherwise, ipfw3(8) will report an unknown/bad command error.

Each module contains two parts. A user-space library is loaded to parse the command line into rules. A kernel-space portion is invoked when traffic hits the firewall, and it triggers the correct action according to the firewall rules.

Comparison with FreeBSD's IPFW

IPFW3 not only inherits most features of FreeBSD's IPFW, but also introduces many new features from OpenBSD's PF and other rivals.

Much more extensible

Every filter/action must be identified by an ID. In FreeBSD's IPFW only 8 bits are available to store the ID, so in theory it can support at most 256 filters/actions. In IPFW3 the space for the filter/action ID is the same, but an additional field holds the module's ID, so in theory IPFW3 can have 256 modules with 256 filters/actions in each module.

In IPFW3, both the user-space library and the kernel-space module are implemented against a simple interface, so it is quite easy to build your own filter or module by following that interface.
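The two-level ID scheme can be pictured with a small model. This is only a sketch under assumptions: the Registry class, the packet dictionary, and the ID numbers used below are hypothetical and do not reflect IPFW3's real kernel structures or interface.

```python
# Illustrative model only: names and layout are assumptions, not IPFW3's real structures.
from typing import Callable, Dict, Tuple

ID_BITS = 8                    # width of each ID field, as described in the text
IDS_PER_FIELD = 1 << ID_BITS   # 256 distinct values per field

# A filter takes a packet (modelled as a dict) and an argument, returns True on match.
Filter = Callable[[dict, str], bool]


class Registry:
    """Maps a (module_id, filter_id) pair to a filter function."""

    def __init__(self) -> None:
        self._filters: Dict[Tuple[int, int], Filter] = {}

    def register(self, module_id: int, filter_id: int, fn: Filter) -> None:
        for value in (module_id, filter_id):
            if not 0 <= value < IDS_PER_FIELD:
                raise ValueError("ID does not fit in 8 bits")
        key = (module_id, filter_id)
        if key in self._filters:
            raise ValueError("ID already registered")
        self._filters[key] = fn

    def match(self, module_id: int, filter_id: int, packet: dict, arg: str) -> bool:
        return self._filters[(module_id, filter_id)](packet, arg)


single_field_capacity = IDS_PER_FIELD                # one 8-bit ID
two_field_capacity = IDS_PER_FIELD * IDS_PER_FIELD   # module ID plus filter ID

registry = Registry()
# Hypothetical numbering: module 1, filter 2 is a source-address filter.
registry.register(1, 2, lambda pkt, arg: pkt["src"] == arg)
```

With one 8-bit field the ID space holds 256 entries; with a module ID in front of it the same model holds 256 x 256 = 65536.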

Much more concise

The rules of IPFW3 are much more concise. For example, a simple rule of IPFW looks like:

ipfw add allow ip from any to any

where the from any to any is there only for readability. While IPFW3 supports the same syntax as FreeBSD's IPFW, we recommend using the simplified version, e.g.:

ipfw3 add allow all
ipfw3 add allow icmp from 1.1.1.1
ipfw3 add allow tcp via em0

Higher Performance

All modern CPUs have multiple cores, and each core runs independently, so DragonFly BSD's LWKT is well suited to fully utilizing the CPU power. By duplicating the environment for each CPU, every CPU can run as fast as it can without interference from the others. That is why IPFW3 is a lockless and stateful firewall.
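The idea of duplicating the environment per CPU can be sketched as follows. This is a simplified model, not the kernel implementation: CpuContext, build_contexts, and add_rule are hypothetical names, and the rule dictionaries are placeholders.

```python
import copy


class CpuContext:
    """Everything one CPU needs to filter packets, with nothing shared."""

    def __init__(self, cpu_id, rules):
        self.cpu_id = cpu_id
        self.rules = copy.deepcopy(rules)  # private copy of the rule list
        self.states = {}                   # private state table


def build_contexts(ncpus, rules):
    """Create one independent context per CPU."""
    return [CpuContext(cpu_id, rules) for cpu_id in range(ncpus)]


def add_rule(contexts, rule):
    """Replicate a new rule into every per-CPU copy."""
    for ctx in contexts:
        ctx.rules.append(dict(rule))


contexts = build_contexts(4, [{"num": 100, "action": "allow"}])
add_rule(contexts, {"num": 200, "action": "deny"})
```

Because each context owns its rules and states, a CPU working on its own context never touches data that another CPU might modify, which is what removes the need for locks in this model.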

Basic Configuration

Core Framework

The following actions are supported directly by the core framework.

accept -- accept the traffic

deny -- deny the traffic

The default action of the firewall is compiled into the core framework, but it can still be overridden with the following system tunable when the module is loaded into the kernel, e.g.,

sysctl net.filters_default_to_accept=1

Basic Module

The following filters/actions are supported by the basic module.

proto -- matches the traffic protocol; it is implicit after the action.

from -- matches the source.

The from filter supports multiple types of parameters:

from 8.8.8.8 -- match traffic from IP 8.8.8.8
from table 1 -- match traffic whose source IP is found in table 1
from any -- match any source (no filtering)
from me -- match traffic from the host itself
from 192.168.1.0/24 -- match traffic from the IP range

to -- matches the destination, which supports the same parameters as filter from

count -- counts the traffic

skipto -- skips to another line in the rules

forward -- forward the current traffic to a destination

in -- matches inbound traffic

out -- matches outbound traffic

via -- matches traffic going through an interface

xmit -- matches outbound traffic through an interface

recv -- matches inbound traffic through an interface

src-port -- matches the source port of TCP/UDP traffic

dst-port -- matches the destination port of TCP/UDP traffic

prob -- randomly match the traffic

keep-state -- sets up a state on the current CPU only

check-state -- checks the traffic against the state table of the current CPU

tag -- adds a tag to the traffic

untag -- removes the tag from the traffic

tagged -- matches traffic carrying the tag

// -- append some comment at the end of the rule
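The parameter forms accepted by from and to, listed above, can be illustrated with a small matcher. This is a sketch under assumptions: match_from, its tables mapping, and its local_addrs argument are hypothetical helpers for illustration, not part of ipfw3.

```python
import ipaddress


def match_from(src, arg, tables=None, local_addrs=()):
    """Return True if source address `src` matches the `from` argument `arg`.

    tables      -- assumed mapping of table id to a list of ip_network objects
    local_addrs -- assumed list of addresses owned by the host (for `me`)
    """
    tables = tables or {}
    addr = ipaddress.ip_address(src)
    words = arg.split()
    if arg == "any":
        return True                      # no filtering
    if arg == "me":
        return addr in {ipaddress.ip_address(a) for a in local_addrs}
    if words[0] == "table":
        return any(addr in net for net in tables.get(int(words[1]), ()))
    # A single IP is treated as a /32 network; a CIDR range is used as given.
    return addr in ipaddress.ip_network(arg, strict=False)
```

For example, match_from("192.168.1.77", "192.168.1.0/24") matches, while match_from("8.8.4.4", "8.8.8.8") does not.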

Layer2 Module

layer2 -- matches layer2 traffic

mac -- matches layer2 traffic with src and dst MAC addresses

mac-from -- matches layer2 traffic with src MAC address (supports lookup table)

mac-to -- matches layer2 traffic with dst MAC address (supports lookup table)

Layer4 Module

tcpflag -- matches the TCP flags

uid -- matches the socket owner's user ID

gid -- matches the socket owner's group ID

established -- matches established TCP connections

bpf -- filter traffic with bpf syntax

NAT Module

nat -- NAT traffic with pre-defined NAT configuration

Dummynet3 Module

pipe -- pipe traffic with pre-defined pipe configuration

queue -- queue traffic with pre-defined queue configuration

Advanced Configuration

Rule Set

Each rule belongs to one of 32 different sets, numbered 0 to 31. Set 31 is reserved for the default rule.

By default, rules are put in set 0, unless you use the set N attribute when entering a new rule. Sets can be individually and atomically enabled or disabled, so this mechanism permits an easy way to store multiple configurations of the firewall and quickly (and atomically) switch between them. The command to enable/disable sets is

ipfw3 set [disable number ...] [enable number ...]

where multiple enable or disable sections can be specified. Command execution is atomic on all the sets specified in the command. By default, all sets are enabled.
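One way to picture atomic enabling and disabling of sets is a 32-bit mask that is computed in full and then swapped in with a single assignment. The sketch below assumes that representation; apply_set_command and rule_is_active are hypothetical names, and the real kernel data layout may differ.

```python
NUM_SETS = 32
ALL_ENABLED = (1 << NUM_SETS) - 1   # by default, all sets are enabled


def _check(n):
    if not 0 <= n < NUM_SETS:
        raise ValueError("set number must be between 0 and 31")


def apply_set_command(mask, disable=(), enable=()):
    """Return the new mask for `set [disable ...] [enable ...]`.

    The whole command is folded into one value, so the caller can replace
    the active mask in a single step.
    """
    new_mask = mask
    for n in disable:
        _check(n)
        new_mask &= ~(1 << n)
    for n in enable:
        _check(n)
        new_mask |= 1 << n
    return new_mask & ALL_ENABLED


def rule_is_active(mask, rule_set):
    """A rule takes part in filtering only if its set is enabled."""
    return bool(mask & (1 << rule_set))


# Models: ipfw3 set disable 1 2 enable 3
active = apply_set_command(ALL_ENABLED, disable=[1, 2], enable=[3])
```

Rules in sets 1 and 2 are then skipped, while rules in every other set stay active.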

Lookup Table

In the following example, we need to create many rules in order to block ICMP traffic from the whole 192.168.0.0/24 network.

ipfw3 add deny icmp from 192.168.0.1
ipfw3 add deny icmp from 192.168.0.2
ipfw3 add deny icmp from 192.168.0.3
...
ipfw3 add deny icmp from 192.168.0.254

The firewall needs to process these 254 rules one by one. The lookup table was introduced for this situation, in order to improve both performance and usability.

ipfw3 table 1 type ip  # create a table of id=1 and type=ip
ipfw3 table 1 add ip 192.168.0.0/24
ipfw3 add deny icmp from table 1

Lookup tables are supported by multiple filters.
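The difference between the two approaches can be sketched by counting comparisons. This is an illustration only: linear_match and table_match are hypothetical helpers, and the table is modelled as a plain list of networks rather than whatever structure the kernel uses.

```python
import ipaddress

# One rule per host address, as in the 254-rule example.
rules = [ipaddress.ip_address(f"192.168.0.{i}") for i in range(1, 255)]

# One table entry covering the whole network.
table_1 = [ipaddress.ip_network("192.168.0.0/24")]


def linear_match(src):
    """Walk the rules in order; return (matched, comparisons made)."""
    addr = ipaddress.ip_address(src)
    for n, rule_addr in enumerate(rules, start=1):
        if addr == rule_addr:
            return True, n
    return False, len(rules)


def table_match(src):
    """Check the table entries; return (matched, comparisons made)."""
    addr = ipaddress.ip_address(src)
    for n, net in enumerate(table_1, start=1):
        if addr in net:
            return True, n
    return False, len(table_1)
```

In this model a packet from 192.168.0.254 costs 254 comparisons against the rule list but a single check against the table. Note that the /24 entry also covers 192.168.0.0 and 192.168.0.255, which the 254 individual rules do not.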

Forwarding

The forward action will change the next-hop on matching packets to ipaddr, which can be an IP address in dotted quad format or a host name. The search terminates if this rule matches.

If ipaddr is a local address, then matching packets will be forwarded to port (or the port number in the packet if one is not specified in the rule) on the local machine. If ipaddr is not a local address, then the port number (if specified) is ignored, and the packet will be forwarded to the remote address, using the route as found in the local routing table for that IP. Use commas to separate multiple IP addresses.

The forward action supports several options: it can be 'round-robin' or 'sticky'. 'sticky' is calculated based on the source IP address. If no forward option is given, the default is 'random'.

ipfw3 add forward 192.168.1.1:80,192.168.1.2:80 round-robin tcp from ....

The example above forwards the traffic to two destinations in round-robin fashion.
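The three selection behaviours can be sketched as below. This is a model under assumptions: pick_destination is a hypothetical helper, and the sticky calculation shown (source address modulo the number of destinations) is only one possible way to derive a choice from the source IP, not necessarily the one IPFW3 uses.

```python
import ipaddress
import random


def pick_destination(destinations, option, src_ip, counter=0, rng=random):
    """Choose one forward destination.

    option  -- "round-robin", "sticky", or None / "random" (the default)
    counter -- number of packets forwarded so far (used by round-robin)
    """
    if option == "round-robin":
        return destinations[counter % len(destinations)]
    if option == "sticky":
        # Assumed calculation: the same source always maps to the same index.
        return destinations[int(ipaddress.ip_address(src_ip)) % len(destinations)]
    if option in (None, "random"):
        return rng.choice(destinations)
    raise ValueError("unknown forward option")


targets = ["192.168.1.1:80", "192.168.1.2:80"]
sequence = [pick_destination(targets, "round-robin", "10.0.0.5", n) for n in range(4)]
```

In this model round-robin alternates between the two targets, and sticky keeps a given source address on one target for as long as the destination list is unchanged.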

A forward rule will not match layer-2 packets (those received on ether_input() or ether_output()). The forward action does not change the contents of the packet at all. In particular, the destination address remains unmodified, so packets forwarded to another system will usually be rejected by that system unless there is a matching rule on that system to capture them. For packets forwarded locally, the local address of the socket will be set to the original destination address of the packet. This makes the netstat(1) entry look rather weird but is intended for use with transparent proxy servers.

BPF Filtering

Berkeley Packet Filter (BPF)'s filtering capabilities are implemented as an interpreter for a machine language for the BPF virtual machine; programs in that language can fetch data from the packet, perform arithmetic operations on data from the packet, and compare the results against constants or against data in the packet or test bits in the results, accepting or rejecting the packet based on the results of those tests. The original paper was written by Steven McCanne and Van Jacobson in 1992 while at Lawrence Berkeley Laboratory.

The ipfw3 firewall integrates a BPF filter. Just-in-time compilation is used to convert the virtual machine instructions into native code, which is invoked when traffic hits a rule with a BPF filter.

ipfw3 add allow all bpf "icmp and src 8.8.8.8"

The bpf filter can be used to filter on any part of the packet, including the payload.

Stateful

Stateful operation is a way for the firewall to dynamically create states for specific flows when packets that match a given pattern are detected. Support for stateful operation comes through the check-state and keep-state options of rules.

States are created when a packet matches a keep-state rule, causing the creation of a dynamic rule which will match all and only packets with a given protocol between a src-ip/src-port dst-ip/dst-port pair of addresses (src and dst are used here only to denote the initial match addresses, but they are completely equivalent afterwards). Dynamic rules will be checked at the first check-state, keep-state or limit occurrence, and the action performed upon a match will be the same as in the parent rule.

Note that no additional attributes other than protocol and IP addresses and ports are checked on dynamic rules.

The typical use of dynamic rules is to keep a closed firewall configuration, but let the first TCP SYN packet from the inside network install a state for the flow so that packets belonging to that session will be allowed through the firewall:

ipfw3 add check-state
ipfw3 add allow tcp from my-subnet to any keep-state
ipfw3 add deny tcp from any to any
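The behaviour of this three-rule configuration can be modelled in a few lines. This is a sketch, not the kernel logic: StateTable, flow_key, and filter_packet are hypothetical names, and MY_SUBNET is an assumed stand-in for my-subnet.

```python
import ipaddress

MY_SUBNET = ipaddress.ip_network("192.168.1.0/24")  # assumed value for my-subnet


def flow_key(proto, src, sport, dst, dport):
    """Order the two endpoints so both directions of a flow share one key."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)


class StateTable:
    def __init__(self):
        self.states = {}

    def keep_state(self, pkt, action):
        self.states[flow_key(*pkt)] = action

    def check_state(self, pkt):
        return self.states.get(flow_key(*pkt))


def filter_packet(table, pkt):
    """Evaluate the three example rules in order for one packet."""
    proto, src, _sport, _dst, _dport = pkt
    action = table.check_state(pkt)              # check-state
    if action is not None:
        return action
    if proto == "tcp" and ipaddress.ip_address(src) in MY_SUBNET:
        table.keep_state(pkt, "allow")           # allow tcp from my-subnet ... keep-state
        return "allow"
    if proto == "tcp":
        return "deny"                            # deny tcp from any to any
    return "default"                             # falls through to the default rule
```

An unsolicited inbound packet is denied, but once a host inside the subnet opens the connection, the reply in the opposite direction matches the stored state and is allowed.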

A similar approach can be used for UDP, where a UDP packet coming from the inside will install a dynamic rule to let the response through the firewall:

ipfw3 add check-state
ipfw3 add allow udp from my-subnet to any keep-state
ipfw3 add deny udp from any to any

States expire after some time, which depends on the status of the flow and the setting of some sysctl variables. See Section SYSCTL VARIABLES for more details. For TCP sessions, dynamic rules can be instructed to periodically send keepalive packets to refresh the state of the rule when it is about to expire.

States can also be added and deleted using the ipfw3 utility. Inserted states are hosted on the current CPU only. Once its lifetime has expired, a state is purged by the housekeeping process and can no longer influence the traffic.

ipfw3 state add rule 1000 udp 192.168.1.100:0 8.8.8.8:53 expiry 600

The example above immediately inserts a state linked to rule 1000 for UDP traffic from 192.168.1.100 to 8.8.8.8:53. The state will expire after 600 seconds if no traffic updates it.
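The expiry behaviour can be sketched with explicit timestamps. This is a simplified model: ExpiringStates and its method names are hypothetical, and time is passed in as a plain number of seconds so the example stays deterministic.

```python
class ExpiringStates:
    """State table whose entries are refreshed by traffic and purged when stale."""

    def __init__(self):
        self._states = {}

    def add(self, key, rule, expiry, now):
        """Insert a state linked to `rule` that lives for `expiry` seconds."""
        self._states[key] = {"rule": rule, "expiry": expiry, "deadline": now + expiry}

    def hit(self, key, now):
        """Return the linked rule for matching traffic and refresh the state."""
        st = self._states.get(key)
        if st is None or now >= st["deadline"]:
            return None
        st["deadline"] = now + st["expiry"]
        return st["rule"]

    def housekeeping(self, now):
        """Purge expired states; return how many were removed."""
        expired = [k for k, st in self._states.items() if now >= st["deadline"]]
        for k in expired:
            del self._states[k]
        return len(expired)

    def __len__(self):
        return len(self._states)


states = ExpiringStates()
key = ("udp", "192.168.1.100", 0, "8.8.8.8", 53)
states.add(key, rule=1000, expiry=600, now=0)
```

A packet seen at 500 seconds pushes the deadline out to 1100 seconds; with no further traffic, a housekeeping pass after that point removes the state.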

In-Kernel NAT

IPFW3 supports in-kernel NAT using the kernel version of libalias, which is a moderately portable set of functions designed to assist in the process of IP masquerading and network address translation. Outgoing packets from a local network with unregistered IP addresses can be aliased to appear as if they came from an accessible IP address. Incoming packets are then de-aliased so that they are sent to the correct machine on the local network.

A certain amount of flexibility is built into the packet aliasing engine. In the simplest mode of operation, a many-to-one address mapping takes place between the local network and the packet aliasing host. This is known as IP masquerading. In addition, one-to-one mappings between local and public addresses can also be implemented, which is known as static NAT. In between these extremes, different groups of private addresses can be linked to different public addresses, comprising several distinct many-to-one mappings. Also, a given public address and port can be statically redirected to a private address/port.

In ipfw3, each CPU has its own context. By storing the alias link records in different contexts according to the CPU, the lock can be removed in order to achieve the best performance. Due to the nature of NAT, the outgoing and returning packets may be handled by different CPUs. To ensure that the return traffic can be translated back to the correct address, a newly created alias_link must be duplicated and inserted into the contexts of both CPUs.

So this in-kernel NAT in DragonFly BSD is the only true lockless in-kernel NAT among all common open-source operating systems.
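The duplication of a new alias link into two per-CPU contexts can be sketched as follows. Everything here is an assumption made for illustration: NatContext, cpu_of, translate_out, and translate_in are hypothetical names, the CRC-based CPU selection is a placeholder for however packets are really dispatched, and the alias address and port are supplied by the caller.

```python
import zlib


class NatContext:
    """Per-CPU store of alias links; never touched by another CPU."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.links = {}


def cpu_of(src, sport, dst, dport, ncpus):
    """Placeholder dispatch: a deterministic hash of the packet's addresses."""
    data = f"{src}:{sport}>{dst}:{dport}".encode()
    return zlib.crc32(data) % ncpus


def translate_out(contexts, alias, pkt):
    """Alias an outgoing packet and store the link where both directions need it."""
    src, sport, dst, dport = pkt
    alias_ip, alias_port = alias
    link = {"private": (src, sport), "alias": alias, "remote": (dst, dport)}
    out_cpu = cpu_of(src, sport, dst, dport, len(contexts))
    in_cpu = cpu_of(dst, dport, alias_ip, alias_port, len(contexts))
    for cpu in {out_cpu, in_cpu}:          # duplicate into both contexts
        contexts[cpu].links[(dst, dport, alias_ip, alias_port)] = link
    return (alias_ip, alias_port, dst, dport), out_cpu, in_cpu


def translate_in(contexts, pkt):
    """De-alias a returning packet using only the handling CPU's own context."""
    src, sport, dst, dport = pkt
    cpu = cpu_of(src, sport, dst, dport, len(contexts))
    link = contexts[cpu].links.get((src, sport, dst, dport))
    if link is None:
        return None
    return (src, sport) + link["private"]
```

The returning packet is looked up only in the context of the CPU that handles it, which works in this model because the link was copied there when it was created.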

IPFW3sync

IPFW3sync is the facility in IPFW3 that synchronizes firewall states between machines running the ipfw3 firewall, for high availability. It can be used together with CARP to ensure that a backup firewall has the same states as the main firewall. When the master machine in the firewall cluster dies, the slave machine will be able to take over the services and accept current connections without loss.

To use this synchronization feature, each firewall needs to be configured as an IPFW3sync centre and/or edge node. The centre firewall then continuously syncs its states to the edges over UDP.

Use the following commands to configure an IPFW3sync edge node, which will listen on UDP port 5000:

ipfw3 sync edge 5000
ipfw3 sync start edge

Then configure an IPFW3sync centre node. Here the centre firewall will automatically sync its states to edge nodes 192.168.1.1:5000 and 192.168.1.2:5001:

ipfw3 sync centre 192.168.1.1:5000,192.168.1.2:5001
ipfw3 sync start centre

Use this command to verify whether the IPFW3sync centre is able to send the test message to all the configured edge nodes:

ipfw3 sync test centre 1

Additional Topics

Logging

IPFW3 supports up to 10 ipfw pseudo interfaces for logging, which can be created with:

ifconfig ipfw1 create

Traffic matching a rule with the log action will be captured by the BPF and duplicated into the corresponding ipfw pseudo interface, e.g.

ipfw3 add 100 allow log 1 icmp from 8.8.8.8

This rule attaches the ICMP packets from 8.8.8.8 to the ipfw1 pseudo interface for logging.

Example Rules

Collection of the IPFW3 samples/articles.