24 Commits

Author SHA1 Message Date
Nate Brown 78d0d46bae
Remove WriteRaw, cidrTree -> routeTree to better describe its purpose, remove redundancy from field names (#582) 2021-11-12 12:47:09 -06:00
Nate Brown 88ce0edf76
Start the overlay package with the old Inside interface (#576) 2021-11-10 21:52:26 -06:00
CzBiX 16be0ce566
Add Wintun support (#289) 2021-11-08 12:36:31 -06:00
Nate Brown bcabcfdaca
Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined 6ae8ba26f7
Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis 32e2619323
Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Wade Simmons 44cb697552
Add more metrics (#450)
* Add more metrics

This change adds the following counter metrics:

Metrics to track packets dropped at the firewall:

    firewall.dropped.local_ip
    firewall.dropped.remote_ip
    firewall.dropped.no_rule

Metrics to track handshake attempts that have been initiated and ones
that have timed out (ones that have completed are tracked by the
existing "handshakes" histogram).

    handshake_manager.initiated
    handshake_manager.timed_out

Metrics to track when cached_packets are dropped because we run out of
buffer space, and how many are sent once the handshake completes.

    hostinfo.cached_packets.dropped
    hostinfo.cached_packets.sent

This change also notes how many cached packets we have when we log the
final "Handshake received" message for either stage1 or stage2.

* separate incoming/outgoing metrics

* remove "allowed" firewall metrics

We don't need these on the hot path; they aren't worth it.

* don't need pointers here
2021-04-27 22:23:18 -04:00
brad-defined 17106f83a0
Ensure the Nebula device exists before attempting to bind to the Nebula IP (#375) 2021-04-16 10:34:28 -05:00
Nathan Brown 64d8e5aa96
More LH cleanup (#429) 2021-04-01 10:23:31 -05:00
Nathan Brown 883e09a392
Don't use a global ca pool (#426) 2021-03-29 12:10:19 -05:00
Nathan Brown 3ea7e1b75f
Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Wade Simmons d604270966
Fix most known data races (#396)
This change fixes all of the known data races that `make smoke-docker-race` finds, except for one.

Most of these races are around the handshake phase for a hostinfo, so we add a RWLock to the hostinfo and Lock during each of the handshake stages.

Some of the other races are around consistently using `atomic` around the `messageCounter` field. To make this harder to mess up, I have renamed the field to `atomicMessageCounter` (I also removed the unnecessary extra pointer dereference as we can just point directly to the struct field).
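
As a rough illustration of the naming convention described above, here is a minimal sketch. Only the field name `atomicMessageCounter` comes from the commit message; the `connectionState` type and the `nextCounter` and `sendMany` helpers are hypothetical and exist only for this example.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// connectionState is a hypothetical stand-in for the struct that owns the
// counter; it is not the project's actual type.
type connectionState struct {
	// The "atomic" prefix signals that this field must only be touched
	// through the sync/atomic package.
	atomicMessageCounter uint64
}

// nextCounter atomically increments the counter and returns the new value.
func (cs *connectionState) nextCounter() uint64 {
	return atomic.AddUint64(&cs.atomicMessageCounter, 1)
}

// sendMany simulates several routines bumping the counter concurrently and
// returns the final value once they have all finished.
func sendMany(cs *connectionState, routines, perRoutine int) uint64 {
	var wg sync.WaitGroup
	for i := 0; i < routines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perRoutine; j++ {
				cs.nextCounter()
			}
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&cs.atomicMessageCounter)
}
```

Because the address of the struct field is passed directly to `sync/atomic`, no separate pointer to the counter is needed.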

The last remaining data race is around reading `ConnectionInfo.ready`, which is a boolean that is only written to once when the handshake has finished. Because it is in the hot path for packets and would only rarely be an actual issue, we are holding off on fixing that one for now.

Here are the results of `make smoke-docker-race`:

before:

    lighthouse1: Found 2 data race(s)
    host2:       Found 36 data race(s)
    host3:       Found 17 data race(s)
    host4:       Found 31 data race(s)

after:

    host2: Found 1 data race(s)
    host4: Found 1 data race(s)

Fixes: #147
Fixes: #226
Fixes: #283
Fixes: #316
2021-03-05 21:18:33 -05:00
Nathan Brown b6234abfb3
Add a way to trigger punch backs via lighthouse (#394) 2021-03-01 19:06:01 -06:00
Wade Simmons 2a4beb41b9
Routine-local conntrack cache (#391)
Previously, every packet we saw took a lock on the conntrack table and updated it. When running with multiple routines, this can cause heavy lock contention and limit the ability of the threads to run independently. This change caches reads from the conntrack table for a very short period of time to reduce this lock contention. This cache will currently default to disabled unless you are running with multiple routines, in which case the default cache delay will be 1 second. This means that entries in the conntrack table may be up to 1 second out of date and remain in a routine-local cache for up to 1 second longer than the global table.

Instead of calling time.Now() for every packet, this cache system relies on a tick thread that updates the current cache "version" each tick. For every packet, we check whether the cache version is out of date, and reset the cache if so.
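
The version-tick idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: every type and function name here (`cacheTicker`, `globalTable`, `routineCache`, `allowed`) is hypothetical, the conntrack table is reduced to a set of flow keys, and the background ticker that would call `tick()` once per cache delay is left out.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// cacheTicker holds the current cache "version". A background ticker (not
// shown) would call tick() once per cache delay.
type cacheTicker struct {
	version uint64
}

func (ct *cacheTicker) tick()           { atomic.AddUint64(&ct.version, 1) }
func (ct *cacheTicker) current() uint64 { return atomic.LoadUint64(&ct.version) }

// globalTable is the shared, lock-protected table (hypothetical and
// simplified to a set of flow keys).
type globalTable struct {
	mu      sync.Mutex
	flows   map[string]struct{}
	lookups int // number of locked lookups, kept only for illustration
}

func (g *globalTable) has(flow string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.lookups++
	_, ok := g.flows[flow]
	return ok
}

// routineCache is owned by a single routine, so it needs no lock.
type routineCache struct {
	ticker  *cacheTicker
	version uint64
	seen    map[string]struct{}
}

func newRoutineCache(ct *cacheTicker) *routineCache {
	return &routineCache{ticker: ct, version: ct.current(), seen: map[string]struct{}{}}
}

// allowed first compares versions (one atomic load, no time.Now()), resets
// the local cache if the version moved, and only then falls back to the
// locked global table on a cache miss.
func (c *routineCache) allowed(g *globalTable, flow string) bool {
	if v := c.ticker.current(); v != c.version {
		c.version = v
		c.seen = map[string]struct{}{}
	}
	if _, ok := c.seen[flow]; ok {
		return true
	}
	if g.has(flow) {
		c.seen[flow] = struct{}{}
		return true
	}
	return false
}
```

In this sketch, repeated packets for the same flow within one tick touch the global lock only once; after a tick the first packet pays for one locked lookup again.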
2021-03-01 19:52:17 -05:00
Wade Simmons d232ccbfab
add metrics for the udp sockets using SO_MEMINFO (#390)
Retrieve the current socket stats using SO_MEMINFO and report them as
metrics gauges. If SO_MEMINFO isn't supported, we don't report these metrics.
2021-03-01 19:51:33 -05:00
Nathan Brown ecfb40f29c
Fix osx for mq changes, this does not implement mq on osx (#395) 2021-03-01 16:57:05 -05:00
Wade Simmons 27d9a67dda
Proper multiqueue support for tun devices (#382)
This change is for Linux only.

Previously, when running with multiple tun.routines, we would only have one file descriptor. This change instead sets IFF_MULTI_QUEUE and opens a file descriptor for each routine. This allows us to process with multiple threads while preventing out of order packet reception issues.

To attempt to distribute the flows across the queues, we try to write to the tun/UDP queue that corresponds with the one we read from. So if we read a packet from tun queue "2", we will write the outgoing encrypted packet to UDP queue "2". Because of the nature of how multi queue works with flows, a given host tunnel will be sticky to a given routine (so if you try to performance benchmark by only using one tunnel between two hosts, you are only going to be using a max of one thread for each direction).

Because this system works much better when we can correlate flows between the tun and udp routines, we are deprecating the undocumented "tun.routines" and "listen.routines" parameters and introducing a new "routines" parameter that sets the value for both. If you use the old undocumented parameters, the max of the values will be used and a warning logged.
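
The fallback rule for the deprecated parameters might look like the sketch below. The function `resolveRoutines` is hypothetical, and two behaviors in it are assumptions that the commit message does not state: that the new "routines" value wins when it is set, and that the count defaults to 1 when nothing is configured.

```go
package main

// resolveRoutines returns the effective routine count and whether a
// deprecation warning should be logged. A value of 0 means "not set".
//
// Assumptions (not confirmed by the commit message): the new parameter takes
// precedence when set, and the default is a single routine.
func resolveRoutines(routines, tunRoutines, listenRoutines int) (n int, warn bool) {
	if routines > 0 {
		return routines, false
	}

	// Fall back to the max of the old, undocumented parameters.
	n = 1
	if tunRoutines > n {
		n = tunRoutines
	}
	if listenRoutines > n {
		n = listenRoutines
	}

	// Warn only if one of the deprecated parameters was actually used.
	warn = tunRoutines > 0 || listenRoutines > 0
	return n, warn
}
```

For example, with only the old values set to 2 and 4, this sketch yields 4 routines and asks the caller to log a warning.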

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2021-02-25 15:01:14 -05:00
Nathan Brown 68e3e84fdc
More like a library (#279) 2020-09-18 09:20:09 -05:00
Wade Simmons f3a6d8d990
Preserve conntrack table during firewall rules reload (SIGHUP) (#233)
Currently, we drop the conntrack table when firewall rules change during a SIGHUP reload. This means responses to inflight HTTP requests can be dropped, among other issues. This change copies the conntrack table over to the new firewall (it holds the conntrack mutex lock during this process, to be safe).

This change also records which firewall rules hash each conntrack entry used, so that we can re-verify the rules after the new firewall has been loaded.
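
A simplified sketch of the re-verification idea follows. All names (`conn`, `firewall`, `reload`, `inConntrack`) are hypothetical, the rules are reduced to a single `allow` callback, and the conntrack mutex mentioned above is omitted to keep the example short.

```go
package main

// conn is a hypothetical conntrack entry that remembers which rules hash it
// was last verified against.
type conn struct {
	rulesHash string
}

// firewall is a hypothetical, heavily reduced firewall: a rules hash, a rule
// check, and a conntrack table keyed by flow.
type firewall struct {
	rulesHash string
	allow     func(flow string) bool
	conntrack map[string]*conn
}

// reload builds a new firewall and copies the old conntrack table into it
// instead of dropping it. (A real implementation would hold the conntrack
// lock while copying; that is left out here.)
func reload(old *firewall, rulesHash string, allow func(string) bool) *firewall {
	table := make(map[string]*conn, len(old.conntrack))
	for flow, c := range old.conntrack {
		cp := *c
		table[flow] = &cp
	}
	return &firewall{rulesHash: rulesHash, allow: allow, conntrack: table}
}

// inConntrack reports whether a flow is tracked. Entries verified against an
// older rules hash are re-checked once against the current rules and are
// either refreshed or evicted.
func (f *firewall) inConntrack(flow string) bool {
	c, ok := f.conntrack[flow]
	if !ok {
		return false
	}
	if c.rulesHash != f.rulesHash {
		if !f.allow(flow) {
			delete(f.conntrack, flow)
			return false
		}
		c.rulesHash = f.rulesHash
	}
	return true
}
```

With this shape, an in-flight flow that the new rules still permit keeps working across a reload, while one that is no longer permitted is dropped the next time it is seen.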
2020-07-31 18:53:36 -04:00
forfuncsake 9b06748506
Make Interface.Inside an interface type (#252)
This commit updates the Interface.Inside type to be a new interface
type instead of a *Tun. This will allow for an inside interface
that does not use a tun device, such as a single-binary client that
can run without elevated privileges.
2020-07-28 08:53:16 -04:00
Wade Simmons b37a91cfbc
add meta packet statistics (#230)
This change adds more metrics around "meta" (non "message" type) packets.
For lighthouse packets, we also record statistics around the specific
lighthouse meta type.

We don't keep statistics for the "message" type so that we don't slow
down the fast path (and you can just look at metrics on the tun
interface to find that information).
2020-06-26 13:45:48 -04:00
Nathan Brown 45a5de2719
Print the udp listen address on startup (#181) 2020-02-06 21:17:43 -08:00
Ryan Huber 6a460ba38b
remove old hmac function. superseded by ix_psk0 2019-11-23 16:50:36 +00:00
Slack Security Team f22b4b584d
Public Release 2019-11-19 17:00:20 +00:00