> BGP w/Quagga and ScreenOS
Posted by prox, from Charlotte, on January 27, 2008 at 19:31 local (server) time

I decided to get my Juniper SSG 5 and Quagga to talk IBGP (NOT iBGP; it's not an Apple product) today.  As expected, I ran into a couple of problems, but eventually succeeded.

The players:

The configuration:

    Juniper SSG 5                     Quagga
/--------------------\          /----------------\
|    10.3.4.21/32    |          |   10.3.4.5/32  |
|     loopback.1     |          |      lo:0      |
|         |          |          |       |        |
\---------|----------/          \-------|--------/
    10.3.253.2/29                  10.3.253.6/29
     ethernet0/0                       eth1
          |                             |
          \-----------[ switch ]--------/

Both systems run OSPF as the IGP, and reside in the backbone area.  Routing between loopback interfaces is in place (both can be PINGed using the correct source interfaces).

The autonomous system number is 65301, a member-AS belonging to a confederation with ID 65003.  The entire BGP topology is out of scope for this, so let's just keep it at that, and accept this is plain-old IBGP, just with some confederation statements.  The SSG 5 is a route reflector client to Quagga, too.

On the SSG 5, both the loopback.1 and ethernet0/0 interfaces reside in the Trust zone, which is a member of the trust-vr.  Here's the BGP configuration on the SSG 5:

e(trust-vr/bgp)-> get config
set protocol bgp 65301
set confederation id 65003
set enable
unset synchronization
set neighbor 10.3.4.5 remote-as 65301 src-interface loopback.1
set neighbor 10.3.4.5 enable
set confederation peer 65300
set confederation peer 65302
set confederation peer 65303
set confederation peer 65304
exit
set interface ethernet0/0 protocol bgp

Here's the relevant neighbor configuration from Quagga's bgpd:

router bgp 65301
 bgp router-id 10.3.4.5
 bgp confederation identifier 65003
 bgp confederation peers 65300 65302 65303 65304
 bgp bestpath as-path confed
 neighbor IBGP peer-group
 neighbor IBGP remote-as 65301
 neighbor IBGP update-source 10.3.4.5
 neighbor IBGP route-reflector-client
 neighbor IBGP next-hop-self
 neighbor IBGP soft-reconfiguration inbound
 neighbor 10.3.4.21 peer-group IBGP
 neighbor 10.3.4.21 description Connection to e.prolixium.com.

Fairly self-explanatory.  I use a peer-group configuration here, since there are some other IBGP neighbors configured.

The test:

With the configuration above, it didn't work, simple as that.  Some tcpdumps and debugs on both sides gave me a little more insight into what was happening, but not why it was.

Here's a piece of the packet capture (slightly edited to fit on the screen):

15:50:26.172512 IP 10.3.4.5.39533 > 10.3.4.21.179: S 2248654264:2248654264(0) win 5840
15:50:32.172469 IP 10.3.4.5.39533 > 10.3.4.21.179: S 2248654264:2248654264(0) win 5840
15:50:44.172496 IP 10.3.4.5.39533 > 10.3.4.21.179: S 2248654264:2248654264(0) win 5840
15:50:46.655847 IP 10.3.4.21.1065 > 10.3.4.5.179: S 3296298904:3296298904(0) win 8192
15:50:46.656517 IP 10.3.4.5.179 > 10.3.4.21.1065: S 2267892974:2267892974(0) ack 3296298905 win 5840
15:50:46.659102 IP 10.3.4.21.1065 > 10.3.4.5.179: . ack 1 win 8192
15:50:47.190300 IP 10.3.4.21.1065 > 10.3.4.5.179: P 1:42(41) ack 1 win 8192: BGP, length: 41
        Open Message (1), length: 41
          Version 4, my AS 65301, Holdtime 180s, ID 10.3.4.21
          Optional parameters, length: 12
            Option Capabilities Advertisement (2), length: 10
              Multiprotocol Extensions (1), length: 4
                AFI IPv4 (1), SAFI Unicast (1)
15:50:47.190620 IP 10.3.4.5.179 > 10.3.4.21.1065: . ack 42 win 5840
15:50:47.193043 IP 10.3.4.5.179 > 10.3.4.21.1065: P 1:22(21) ack 42 win 5840: BGP, length: 21
        Notification Message (3), length: 21, Cease (6), subcode Connection Rejected (5)
15:50:47.193785 IP 10.3.4.5.179 > 10.3.4.21.1065: F 22:22(0) ack 42 win 5840
15:50:47.197039 IP 10.3.4.21.1065 > 10.3.4.5.179: FP 42:63(21) ack 22 win 8192: BGP, length: 21
        Notification Message (3), length: 21, Finite State Machine Error (5)
15:50:47.197450 IP 10.3.4.5.179 > 10.3.4.21.1065: R 2267892996:2267892996(0) win 0
15:50:47.197521 IP 10.3.4.21.1065 > 10.3.4.5.179: FP 63:63(0) ack 22 win 8192
15:50:47.197710 IP 10.3.4.5.179 > 10.3.4.21.1065: R 2267892996:2267892996(0) win 0
15:50:47.197759 IP 10.3.4.21.1065 > 10.3.4.5.179: FP 63:63(0) ack 23 win 8192
15:50:47.197899 IP 10.3.4.5.179 > 10.3.4.21.1065: R 2267892997:2267892997(0) win 0

There are a couple of problems there.

First, it doesn't look like Quagga can connect to the loopback.1 interface on the SSG 5: its three SYNs to port 179 go unanswered.  Enabling the bgp protocol on the loopback.1 interface didn't help.  Anyway, according to the connection collision rules in RFC 4271 (section 6.8), if both BGP connections were established in parallel, the one initiated by the speaker with the lower router ID (Quagga, in this case) will be torn down.  So, who cares about that connection, I guess (it's still not acting right, but I can live with that).
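The collision rule from RFC 4271 section 6.8 can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names are made up for this post, and equal identifiers are simply refused rather than handled the way a real speaker would.  The one thing worth noticing is that router IDs are compared as 4-octet unsigned integers, not as strings.

```python
import ipaddress


def bgp_id(dotted: str) -> int:
    """Convert a dotted-quad BGP Identifier to a 4-octet unsigned integer."""
    return int(ipaddress.IPv4Address(dotted))


def surviving_initiator(local_id: str, remote_id: str) -> str:
    """Say whose outbound connection is kept after a collision.

    Per RFC 4271 section 6.8, the connection initiated by the speaker
    with the higher BGP Identifier survives.  Hypothetical helper, not
    taken from Quagga or ScreenOS.
    """
    local, remote = bgp_id(local_id), bgp_id(remote_id)
    if local == remote:
        # Simplification for this sketch: refuse equal identifiers.
        raise ValueError("BGP Identifiers must differ")
    return "local" if local > remote else "remote"


# Router IDs from the two configurations in this post.
QUAGGA_ID = "10.3.4.5"
SSG5_ID = "10.3.4.21"

# Seen from Quagga: which side's connection would be kept?
winner = surviving_initiator(QUAGGA_ID, SSG5_ID)
print(winner)
```

Seen from Quagga, the result is "remote", meaning the session opened by the SSG 5 is the one that would be kept.  A plain string comparison would get this backwards, since "10.3.4.5" sorts after "10.3.4.21".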

Secondly, Quagga is immediately rejecting the connection from the SSG 5.  Why?  Who knows.  The debug on Quagga's bgpd sheds absolutely no light on the situation:

2008/01/27 06:26:47 BGP: [Event] BGP connection from host 10.3.4.21
2008/01/27 06:26:47 BGP: [Event] Make dummy peer structure until read Open packet
2008/01/27 06:26:47 BGP: 10.3.4.21 [FSM] TCP_connection_open (Active->OpenSent)
2008/01/27 06:26:47 BGP: 10.3.4.21 passive open
2008/01/27 06:26:47 BGP: 10.3.4.21 went from Active to OpenSent
2008/01/27 06:26:47 BGP: 10.3.4.21 rcv message type 1, length (excl. header) 22
2008/01/27 06:26:47 BGP: 10.3.4.21 rcv OPEN, version 4, remote-as 65301, holdtime 180, id 10.3.4.21
2008/01/27 06:26:47 BGP: 10.3.4.21 peer status is Connect close connection
2008/01/27 06:26:47 BGP: 10.3.4.21 sending NOTIFICATION 6/5 (Cease/Connection Rejected) 0 bytes
2008/01/27 06:26:47 BGP: 10.3.4.21 send message type 3, length (incl. header) 21
2008/01/27 06:26:47 BGP: 10.3.4.21 [Event] Accepting BGP peer delete
2008/01/27 06:26:47 BGP: 10.3.4.21 went from OpenSent to Deleted

That basically tells me that Quagga's bgpd said "I just received an OPEN message, but I guess I didn't like it" or something.  Absolutely no information indicating WHY the peer was dumped.
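When the debug output gets longer than this, it can help to boil it down to the state changes and NOTIFICATIONs.  Here is a rough sketch that does so for a few of the lines above.  The regular expressions are assumptions based only on the log format shown in this post, not on any documented bgpd format, and the names are hypothetical.

```python
import re

# A subset of the bgpd debug lines quoted above.
LOG = """\
2008/01/27 06:26:47 BGP: 10.3.4.21 [FSM] TCP_connection_open (Active->OpenSent)
2008/01/27 06:26:47 BGP: 10.3.4.21 went from Active to OpenSent
2008/01/27 06:26:47 BGP: 10.3.4.21 rcv OPEN, version 4, remote-as 65301, holdtime 180, id 10.3.4.21
2008/01/27 06:26:47 BGP: 10.3.4.21 peer status is Connect close connection
2008/01/27 06:26:47 BGP: 10.3.4.21 sending NOTIFICATION 6/5 (Cease/Connection Rejected) 0 bytes
2008/01/27 06:26:47 BGP: 10.3.4.21 went from OpenSent to Deleted
"""

# Patterns guessed from the excerpt; adjust them for other bgpd versions.
TRANSITION = re.compile(r"BGP: (\S+) went from (\w+) to (\w+)")
STATUS = re.compile(r"BGP: (\S+) peer status is (\w+)")
NOTIFY = re.compile(r"BGP: (\S+) sending NOTIFICATION (\d+)/(\d+)")


def summarize(log: str) -> dict:
    """Collect state transitions, reported peer status, and sent NOTIFICATIONs."""
    return {
        "transitions": [m.groups() for m in TRANSITION.finditer(log)],
        "peer_status": [m.groups() for m in STATUS.finditer(log)],
        "notifications": [(p, int(c), int(s)) for p, c, s in NOTIFY.findall(log)],
    }


summary = summarize(LOG)
for key, rows in summary.items():
    print(key, rows)
```

It only restates what the log already says (Active to OpenSent to Deleted, a peer status of Connect, and a 6/5 NOTIFICATION), but in a form that is easier to scan.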

Anyway, back to the tcpdump output, the third (relatively minor) thing I see wrong is that the SSG 5 sent a Finite State Machine Error (5) NOTIFICATION after it received the Cease (6) NOTIFICATION, which tells the peer that the connection is being torn down.  Both systems then exchange a couple of odd-looking TCP FINs and RSTs, and give up.  The FSM error is clearly a problem with the BGP implementation on ScreenOS.  If you receive a Cease (6), don't bother sending other messages to the peer!
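For reference, the 21-byte NOTIFICATION in the capture is just the 19-byte BGP header followed by an error code and a subcode.  Below is a minimal decoding sketch.  The message bytes are rebuilt from the fields tcpdump printed, not copied from the wire, and the function name is hypothetical.  Error codes come from RFC 4271 and Cease subcodes from RFC 4486.

```python
import struct

# NOTIFICATION error codes (RFC 4271).
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Cease subcodes (RFC 4486).
CEASE_SUBCODES = {
    1: "Maximum Number of Prefixes Reached",
    2: "Administrative Shutdown",
    3: "Peer De-configured",
    4: "Administrative Reset",
    5: "Connection Rejected",
    6: "Other Configuration Change",
    7: "Connection Collision Resolution",
    8: "Out of Resources",
}


def decode_notification(msg: bytes) -> tuple:
    """Return (error name, subcode name) for a BGP NOTIFICATION message."""
    _marker, length, msg_type = struct.unpack("!16sHB", msg[:19])
    if msg_type != 3 or length != len(msg) or length < 21:
        raise ValueError("not a well-formed NOTIFICATION")
    code, subcode = struct.unpack("!BB", msg[19:21])
    error = ERROR_CODES.get(code, f"Unknown ({code})")
    if code == 6:
        detail = CEASE_SUBCODES.get(subcode, f"Unknown ({subcode})")
    else:
        detail = f"subcode {subcode}"
    return error, detail


# Rebuilt from the capture: 16-byte marker, length 21, type 3, code 6, subcode 5.
cease = b"\xff" * 16 + struct.pack("!HBBB", 21, 3, 6, 5)
print(decode_notification(cease))
```

This matches what tcpdump printed for Quagga's message: Cease (6), subcode Connection Rejected (5).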

Since that didn't help too much, here's a little more detail on the OPEN message from the SSG 5:

OPEN Message
    Marker: 16 bytes
    Length: 41 bytes
    Type: OPEN Message (1)
    Version: 4
    My AS: 65301
    Hold time: 180
    BGP identifier: 10.3.4.21
    Optional parameters length: 12 bytes
    Optional parameters
        Capabilities Advertisement (12 bytes)
            Parameter type: Capabilities (2)
            Parameter length: 10 bytes
            Multiprotocol extensions capability (6 bytes)
                Capability code: Multiprotocol extensions capability (1)
                Capability length: 4 bytes
                Capability value
                    Address family identifier: IPv4 (1)
                    Reserved: 1 byte
                    Subsequent address family identifier: Unicast (1)
            Route refresh capability (2 bytes)
                Capability code: Route refresh capability (2)
                Capability length: 0 bytes
            Route refresh capability (2 bytes)
                Capability code: Route refresh capability (128)
                Capability length: 0 bytes

Nothing out of the ordinary, there.  I compared this to the other OPEN messages on some of my other connections, and it's identical.  JUNOS advertises graceful restart (64), but it's not negotiated by Quagga.
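To double-check the lengths in that decode, here is a small sketch that rebuilds the OPEN message from the listed fields and walks the capabilities back out.  It is an illustration only: the bytes are reconstructed from the dissector output above rather than taken from the capture, and both function names are made up.

```python
import ipaddress
import struct


def build_open() -> bytes:
    """Rebuild the SSG 5's OPEN from the decoded fields shown above."""
    caps = (
        struct.pack("!BBHBB", 1, 4, 1, 0, 1)  # multiprotocol: AFI 1, reserved, SAFI 1
        + struct.pack("!BB", 2, 0)            # route refresh (2), no value
        + struct.pack("!BB", 128, 0)          # route refresh (128), no value
    )
    opt = struct.pack("!BB", 2, len(caps)) + caps  # parameter type 2 = capabilities
    router_id = ipaddress.IPv4Address("10.3.4.21").packed
    body = struct.pack("!BHH4sB", 4, 65301, 180, router_id, len(opt)) + opt
    # Header: 16-byte marker, total length, message type 1 (OPEN).
    return b"\xff" * 16 + struct.pack("!HB", 19 + len(body), 1) + body


def parse_open(msg: bytes) -> dict:
    """Decode the fixed OPEN fields and list (code, value) capability pairs."""
    length, msg_type = struct.unpack("!HB", msg[16:19])
    version, my_as, hold, ident, opt_len = struct.unpack("!BHH4sB", msg[19:29])
    caps = []
    pos, end = 29, 29 + opt_len
    while pos < end:
        p_type, p_len = msg[pos], msg[pos + 1]
        pos += 2
        p_end = pos + p_len
        while p_type == 2 and pos < p_end:
            code, c_len = msg[pos], msg[pos + 1]
            caps.append((code, msg[pos + 2:pos + 2 + c_len]))
            pos += 2 + c_len
        pos = p_end
    return {
        "length": length,
        "type": msg_type,
        "version": version,
        "my_as": my_as,
        "hold_time": hold,
        "bgp_id": str(ipaddress.IPv4Address(ident)),
        "capabilities": caps,
    }


parsed = parse_open(build_open())
print(parsed)
```

The arithmetic lines up with both tools: 19 bytes of header plus 10 bytes of fixed fields plus 12 bytes of optional parameters gives 41, and 41 minus the 19-byte header is the 22 that bgpd logged as the length excluding the header.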

I then started changing random knobs in the Quagga configuration, and found one that worked: setting the peer to passive only.  This tells bgpd to never initiate connections to the other neighbor, only accept them.  Bingo, the next connection from the SSG 5 was accepted, each system sent an OPEN message, and the RIB on the SSG 5 became populated.

Conclusions:

Here's the one-liner that made it possible:

 neighbor 10.3.4.21 passive

I still think BOTH implementations are somewhat broken.  Quagga, definitely, since setting the passive option shouldn't affect whether it accepts an inbound connection.  ScreenOS, because it won't accept any inbound connections at all.

It's possible I may need a policy from Trust to Trust, permitting TCP port 179, but that too seems like a bad workaround.

Well, there it is.  Probably more information than you wanted.

Comment by artom on June 12, 2010 at 11:45 local (server) time

Great !
many thanks!!!
IT`s helps me very much

