BGP Explained

Border Gateway
Protocol
Anl ALBEYOLU
Network Consultant
CCIE #24974
Agenda
General Information
Theory
Explains every component in detail.
Basic data communication and solid routing knowledge are prerequisites for understanding the theory.

Implementation
Cisco IOS 12.4(25d) Advanced Enterprise code is used throughout the implementation.
GNS3 is used for the lab and Wireshark is used as the packet sniffer.
General Information (1)

Used for routing between "Autonomous Systems".
Classified as a "Path Vector" routing protocol.
Uses "Attributes" to manipulate inside/outside traffic
flow.
Reliable, since it uses TCP as its transport protocol.
Scalable, hierarchical and loop-free.
Can be secured (e.g. with MD5 authentication between peers).
Open standard (RFC 4271).
The Internet relies on BGP.
Both ISPs and enterprise customers can run BGP.
General Information (2)

Is not appropriate when:
There is a single connection to the ISP,
Policies aren't used,
Enough memory & CPU aren't available,
Technical staff aren't qualified enough to operate & troubleshoot it.
Is appropriate when:
There are multiple connections to ISPs,
Policies are used,
Enough memory & CPU are available,
Technical staff are qualified enough to operate & troubleshoot it.
Theory General Concepts (1)

BGP can be regarded as a regular TCP application that uses TCP port 179. So, to open a TCP connection, a proper route to the destination IP address (in other words, the BGP peer) must exist in the routing table. Prefixes can then be exchanged over the established TCP connection.
BGP neighbors can't be discovered; they must be defined manually.
Only one TCP session is maintained, even if both ends successfully open a TCP connection.
All BGP messages are sent as unicast.

BGP has 2 types of neighborship:
iBGP (internal BGP; BGP neighbors are in the same AS)
eBGP (external BGP; BGP neighbors are in different ASs)
iBGP administrative distance is 200, eBGP administrative distance is 20.
BGP supports summarization and CIDR.
Uses incremental/triggered updates.
Only installs the "best path" into the routing table and only announces the "best path" to other BGP peers (a prefix to be advertised must exist exactly in the routing table).
Can use MD5 authentication between peers.

BGP maintains 2 tables:
Neighbor table (contains BGP timers, prefix-related information, BGP messages sent to/received from neighbors, address-family and denied-prefix information, etc. In other words, it contains all neighbor-related information)
BGP table (contains all learned BGP prefixes, their attributes, best/all paths)
Only best paths are put into the routing table (if multipath load sharing is enabled, more than one path, up to 16, can be put into the routing table).
A BGP router with synchronization enabled does not install iBGP-learned routes into its routing table if it is not able to validate those routes in its IGP. (Disabling sync. is best practice; it's mostly not used.)
If sync. is enabled and the IGP is OSPF, the neighbor's OSPF & BGP router IDs must be the same.
Either disable sync. & run BGP on all routers inside the AS, or keep it & redistribute prefixes from BGP into the IGP (or tunnel BGP over GRE, IPIP, etc.).

BGP has a Split Horizon rule: a prefix learned from an iBGP neighbor cannot be advertised to another iBGP peer.

When a router receives an UPDATE message that contains its own AS number in the AS_PATH attribute, it ignores it (this is known as the AS-Path loop prevention mechanism). This works because when an UPDATE message leaves an AS, that AS number is prepended and the UPDATE message is sent with the new AS_PATH information.
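The AS-Path loop prevention mechanism can be sketched in a few lines of Python. This is only an illustration, not router code: the function names are hypothetical and AS_PATH is modelled as a plain list of AS numbers with the most recent AS first.

```python
from typing import List


def advertise_to_ebgp_peer(local_as: int, as_path: List[int]) -> List[int]:
    """Prepend the local AS number when an UPDATE leaves the AS."""
    return [local_as] + as_path


def accept_update(local_as: int, as_path: List[int]) -> bool:
    """Ignore the UPDATE if our own AS number is already in AS_PATH."""
    return local_as not in as_path


# A prefix originated in AS 1230 is sent to AS 60, which passes it on.
path = advertise_to_ebgp_peer(1230, [])
path = advertise_to_ebgp_peer(60, path)
print(path)                        # [60, 1230]
print(accept_update(4570, path))   # True: AS 4570 is not in the path
print(accept_update(1230, path))   # False: AS 1230 sees its own number
```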

Inside the AS, all routers must be fully meshed iBGP peers (because of the BGP Split Horizon rule). Also, iBGP peers don't modify the NEXT_HOP attribute in UPDATE messages between each other.
This means that when you have N routers in your AS, you need (N x [N-1]) / 2 iBGP sessions in your BGP domain.
Of course this doesn't scale for large deployments. The number of TCP sessions also becomes extra overhead for the routers, and duplicate routing traffic traverses the whole network. There are 2 options to solve this problem:
Route reflectors can be used (one or more routers are assigned as a "reflector"; these routers advertise routing information to clients and, in some cases, to non-clients)
Confederations can be used (the main AS is divided into sub-ASs and all rules remain the same; the sub-ASs establish eBGP sessions between each other, etc.).
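To get a feeling for how quickly the full mesh grows, the session count can be computed with a short Python sketch. The route reflector figure assumes a simple single-cluster design in which every client peers only with the reflector(s) and the reflectors are fully meshed with each other; that design and the function names are assumptions made for illustration.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed for a full mesh of N routers: N x (N-1) / 2."""
    return routers * (routers - 1) // 2


def single_cluster_rr_sessions(routers: int, reflectors: int = 1) -> int:
    """Sessions when each client peers only with the reflectors.

    Assumption: one cluster, reflectors fully meshed with each other.
    """
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (5, 10, 100):
    print(n, full_mesh_sessions(n), single_cluster_rr_sessions(n))
# 5 10 4
# 10 45 9
# 100 4950 99
```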

A router can belong to only one AS.
2-byte (16-bit) BGP AS numbers range from 0 to 65535.
0, 59392-64511 and 65535 are reserved by IANA.
64512-65534 can be used as private AS numbers (e.g. in confederation deployments, as a sub-AS).
The remaining numbers can be used as public AS numbers.
In eBGP peerings, the TTL value is 1. The NEXT_HOP attribute is modified between eBGP peers.
In iBGP peerings, the TTL value is 255. The NEXT_HOP attribute isn't modified between iBGP peers.
Theory Messages
BGP has 4 message types:
OPEN
KEEPALIVE
UPDATE
NOTIFICATION
Additionally, the ROUTE REFRESH message (type 5) can be used if it is agreed upon during peering.
Theory OPEN Message

After a neighbor has been configured and the TCP session has been established, an OPEN message (type 1) is first sent to the neighbor to form a BGP peering.
This message contains information such as the BGP version (currently 4), the AS number, the hold time value (timers are negotiated between peers) and the BGP router-ID. It also contains optional parameters such as capabilities. Capabilities list which address-family identifiers (AFI) and subsequent AFIs (SAFI) can be used, as well as features such as Route Refresh.
The next slide shows a capture of an OPEN message.
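As a rough illustration of the fields listed above, the following Python sketch builds and parses a minimal OPEN message using the fixed layout from RFC 4271 (16-byte marker, 2-byte length, 1-byte type, then version, AS number, hold time, router-ID and optional parameter length). It deliberately sends no optional parameters, so no capabilities are advertised; the function names are hypothetical.

```python
import socket
import struct

MARKER = b"\xff" * 16   # all ones in the BGP message header
OPEN = 1                # message type 1
HEADER_LEN = 19


def build_open(my_as: int, hold_time: int, router_id: str) -> bytes:
    """Build an OPEN message without optional parameters."""
    body = struct.pack("!BHH4sB", 4, my_as, hold_time,
                       socket.inet_aton(router_id), 0)
    return MARKER + struct.pack("!HB", HEADER_LEN + len(body), OPEN) + body


def parse_open(msg: bytes) -> dict:
    """Decode the fixed part of an OPEN message."""
    length, msg_type = struct.unpack("!HB", msg[16:19])
    if msg[:16] != MARKER or msg_type != OPEN or length != len(msg):
        raise ValueError("not a well-formed OPEN message")
    version, my_as, hold, rid, opt_len = struct.unpack("!BHH4sB", msg[19:29])
    return {"version": version, "my_as": my_as, "hold_time": hold,
            "router_id": socket.inet_ntoa(rid), "opt_param_len": opt_len}


msg = build_open(1230, 180, "1.1.1.1")
print(len(msg))          # 29
print(parse_open(msg))
```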
Theory OPEN Message Structure
Theory KEEPALIVE Message

After the BGP peering has been established, periodic KEEPALIVE messages (type 4) are sent, every 60 seconds by default.
It's a simple message that carries no additional information; it just confirms that the BGP peering is up and working.
If a neighbor does not receive KEEPALIVE messages within the hold time negotiated through the OPEN messages, it assumes that the other side is no longer a BGP peer, ends the BGP session by sending a NOTIFICATION message (covered later) and then closes the TCP session by sending a TCP FIN packet.
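The timer negotiation can be sketched as follows. Per RFC 4271 the session uses the smaller of the two hold times offered in the OPEN messages, and a hold time of 1 or 2 seconds is not allowed. Deriving the keepalive interval as one third of the hold time is a common convention and is taken here as an assumption; real implementations may also honour a locally configured keepalive value.

```python
from typing import Tuple


def negotiate_timers(local_hold: int, peer_hold: int) -> Tuple[int, int]:
    """Return (hold_time, keepalive_interval) in seconds for the session."""
    hold = min(local_hold, peer_hold)
    if hold in (1, 2):
        raise ValueError("hold time must be 0 or at least 3 seconds")
    keepalive = hold // 3   # assumption: one third of the hold time
    return hold, keepalive


print(negotiate_timers(180, 180))   # (180, 60)
print(negotiate_timers(180, 90))    # (90, 30)
```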
Theory KEEPALIVE Message Structure
Theory UPDATE Message (1)

As mentioned in previous slides, first the TCP connection is established, then the BGP peering is established with OPEN messages; now it's time to exchange prefix information between the peers. UPDATE messages (type 2) are used for this.
UPDATE messages carry detailed information about the related prefix(es) and their attributes. In the BGP world the term NLRI (Network Layer Reachability Information) is used instead of prefix.
When an NLRI becomes unreachable, the UPDATE message carries this information as "withdrawn routes".
One UPDATE message can contain multiple NLRIs that share the same attributes. Also, multiple BGP messages can be transported in one TCP segment (see next slide).
Theory UPDATE Message (2)
As you see above, there's a successful TCP connection followed by a successful BGP peering. Next, UPDATE messages are exchanged between the two BGP peers.
Theory UPDATE Message Structure

UPDATE message format with withdrawn routes:
UPDATE message format with new routes:
Theory NOTIFICATION Message

A NOTIFICATION message (type 3) is sent when there is a problem. This message closes the BGP connection.
There may be many reasons to send a NOTIFICATION message (wrong neighbor AS number configuration, hold timer expiration, etc.).
The reason is included in the NOTIFICATION message, so you can troubleshoot it easily.
Theory NOTIFICATION Message Structure
Below you can see that the hold timer expired:
In the example below, a wrong neighbor AS number has been configured, which means the other side doesn't expect that AS number in the OPEN message:
Theory ROUTE REFRESH Message

The ROUTE REFRESH message (type 5) is a special message that asks the BGP peer to send again the prefix information it has already advertised. It carries very little information (just the AFI/SAFI); it is simply a request to the BGP peer.
In early deployments of BGP, whenever you changed the routing policy, the related BGP connection had to be reset. This open standard feature (Route Refresh Capability, RFC 2918) avoids that.
When you change the routing policy, the router sends this message for the impacted AFI/SAFI to its peer; if the peer router understands the message, it re-advertises the prefixes.
This capability is advertised during BGP peering establishment (as you learned from previous slides, there are "capabilities" in the "Optional Parameters" field of the OPEN message).
Theory ROUTE REFRESH Message Structure
Below you can see that the routing policy is changed for unicast IPv4 traffic:
As soon as it's sent, UPDATE messages are exchanged between the peers:
Theory States
BGP neighbor states are:
IDLE
CONNECT
ACTIVE
OPENSENT
OPENCONFIRM
ESTABLISHED
Theory IDLE & CONNECT State

In IDLE state, the router does not allocate any BGP resources and does not accept any incoming BGP session.
In CONNECT state, BGP waits for the TCP connection to complete. If the TCP connection is successful, BGP immediately sends an OPEN message to the peer and the BGP FSM goes to OPENSENT. If the TCP connection is not completed, the BGP FSM goes to ACTIVE, CONNECT or IDLE state depending on the failure reason.
Theory ACTIVE & OPENSENT State

In ACTIVE state, the router keeps trying to establish the TCP connection. If it's successful, the BGP router immediately sends an OPEN message and the BGP FSM goes to OPENSENT state. Otherwise the BGP FSM goes back to CONNECT (when the ConnectRetry timer expires) or to IDLE state.
In OPENSENT state, the BGP router has already sent an OPEN message and is waiting for an OPEN message from its peer. If an OPEN message is received successfully, a KEEPALIVE message is sent to the peer and the BGP FSM goes to OPENCONFIRM state. In the case of failure, the BGP FSM goes to ACTIVE or IDLE state.
Theory OPENCONFIRM & ESTABLISHED State
In OPENCONFIRM state, the BGP router has already received an OPEN message from its peer and is waiting for a KEEPALIVE message from it. If it receives a KEEPALIVE, the BGP FSM goes to ESTABLISHED state, otherwise it goes to IDLE state (at this point the BGP FSM is one step away from its final state).
In ESTABLISHED state, the BGP router has received a KEEPALIVE message from its peer. From this point on, the BGP peers can exchange information with UPDATE messages (KEEPALIVE messages are also sent periodically, and NOTIFICATION messages can be sent in the case of failure).
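The state descriptions can be condensed into a small transition table. The Python sketch below is a heavily simplified model loosely based on RFC 4271: the event names are hypothetical, timers are not modelled, and any event that is not listed (errors, NOTIFICATION, hold timer expiry) simply drops the session back to IDLE.

```python
from typing import Iterable

# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    ("IDLE", "start"): "CONNECT",
    ("CONNECT", "tcp_established"): "OPENSENT",
    ("CONNECT", "tcp_failed"): "ACTIVE",
    ("ACTIVE", "tcp_established"): "OPENSENT",
    ("ACTIVE", "connect_retry_expired"): "CONNECT",
    ("OPENSENT", "open_received"): "OPENCONFIRM",
    ("OPENSENT", "tcp_failed"): "ACTIVE",
    ("OPENCONFIRM", "keepalive_received"): "ESTABLISHED",
    ("ESTABLISHED", "keepalive_received"): "ESTABLISHED",
    ("ESTABLISHED", "update_received"): "ESTABLISHED",
}


def next_state(state: str, event: str) -> str:
    """Unlisted combinations are treated as failures and return to IDLE."""
    return TRANSITIONS.get((state, event), "IDLE")


def run(events: Iterable[str], state: str = "IDLE") -> str:
    for event in events:
        state = next_state(state, event)
    return state


happy_path = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(happy_path))                           # ESTABLISHED
print(run(happy_path + ["hold_timer_expired"]))  # IDLE
```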
Theory BGP Finite State Machine
Theory Attributes
BGP attributes fall into several categories:
Well-Known Mandatory: Must be supported and recognised by all BGP routers. These attributes must be included in UPDATE messages. They must be passed on to other BGP routers.
Well-Known Discretionary: Must be supported and recognised by all BGP routers. They must be passed on to other BGP routers, but they may or may not be included in UPDATE messages; it's not mandatory.
Optional Transitive: May or may not be recognised by BGP routers, but must be passed on to other BGP routers. If these attributes aren't recognised, they're marked as "partial".
Optional Non-transitive: May or may not be recognised by BGP routers and aren't passed on to other BGP routers.
Some vendors also use additional attributes to influence the best path selection algorithm; for example, Cisco Systems uses the weight attribute, which is locally significant (higher is better).
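On the wire, each path attribute carries a flags byte whose high-order bits encode these categories (Optional 0x80, Transitive 0x40, Partial 0x20 in RFC 4271). Whether a well-known attribute is mandatory or discretionary is not encoded in the flags; it follows from the attribute type code. The Python sketch below shows one way to classify an attribute; the function name is hypothetical.

```python
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20

# Type codes of the well-known mandatory attributes (RFC 4271).
MANDATORY_TYPES = {1: "ORIGIN", 2: "AS_PATH", 3: "NEXT_HOP"}


def classify(flags: int, type_code: int) -> str:
    """Map an attribute's flags byte and type code to its category."""
    if not flags & OPTIONAL:
        kind = "mandatory" if type_code in MANDATORY_TYPES else "discretionary"
        return f"well-known {kind}"
    if flags & TRANSITIVE:
        suffix = " (partial)" if flags & PARTIAL else ""
        return "optional transitive" + suffix
    return "optional non-transitive"


print(classify(0x40, 1))   # ORIGIN      -> well-known mandatory
print(classify(0x40, 5))   # LOCAL_PREF  -> well-known discretionary
print(classify(0xC0, 8))   # COMMUNITIES -> optional transitive
print(classify(0x80, 4))   # MED         -> optional non-transitive
```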
Theory Well-Known Mandatory Attributes (1)
NEXT_HOP: Holds the IP address of the BGP router that advertises the UPDATE message. By default it doesn't change when the UPDATE message is sent to an iBGP peer, and changes when the UPDATE message is sent to an eBGP peer.
AS_PATH: Holds an ordered list of the AS numbers that the UPDATE message has traversed. With this attribute, incoming traffic to an AS can be manipulated (you can prepend it).
ORIGIN: Holds the information that explains how this NLRI has been learned (discussed in more detail in the "Implementation" section).
Theory Well-Known Mandatory Attributes (2)
The first packet shows that this UPDATE message is originated from this router, this NLRI has been learned from IGP (you'll see later what it means) and the next hop to reach this destination is 10.0.12.1.
The second packet shows that this UPDATE message is originated from AS 4570 and passed through AS 60, this NLRI has been learned from IGP (you'll see later what it means) and the next hop to reach this destination is 10.0.16.6.
Theory Well-Known Discretionary Attributes (1)
LOCAL_PREF: Holds the value that tells iBGP peers which path they should select to reach a specific NLRI outside the AS. In other words, it's a metric for iBGP peers inside the AS to reach destinations outside the AS (higher is better). With this attribute, traffic leaving the AS can be manipulated. This attribute is propagated throughout the local AS (discussed in more detail in the "Implementation" section).
ATOMIC_AGGREGATE: Informs the iBGP/eBGP neighbor that the originating router aggregated the routes.
Theory Well-Known Discretionary Attributes (2)
This packet shows that the LOCAL_PREF attribute value is 100 for the NLRI(s) and that the NLRIs were aggregated by a BGP router which originates more specific NLRIs.
Theory Optional Transitive Attributes (1)

AGGREGATOR: Holds the IP address and the AS number of the BGP router that performed the summarization/aggregation.
COMMUNITIES: Route tags that are used for filtering, building specific policies and manipulating the routing process.
Theory Optional Transitive Attributes (2)

This packet shows that the BGP router with router-id 1.1.1.1 in AS 1230 has aggregated this NLRI and that the packet has a community attribute set to NO_ADVERTISE (discussed in more detail in the "Implementation" section).
Theory Optional Non-Transitive Attributes (1)
MED (MULTI_EXIT_DISC): This attribute is a metric for eBGP peers; according to it, the neighbor AS selects the entrance to our AS (lower is better).
CLUSTER_LIST: Holds the cluster IDs (by default the router-IDs) of the Route Reflectors that the UPDATE message has passed through. With this information, loops are avoided (e.g. a route reflector ignores UPDATE messages that contain its own cluster ID in the CLUSTER_LIST attribute, since that means the UPDATE message has already traversed its cluster). This attribute isn't used between an RR & its client.
ORIGINATOR_ID: Holds the router-ID of the first announcer (originator) of the NLRI in topologies that contain Route Reflectors (you'll see what it means in the next slide). This attribute isn't used between an RR & its client.
Theory Optional Non-Transitive Attributes (2)
The first packet shows that this UPDATE message is originated from 4.4.4.4, then passes through a route reflector (3.3.3.3), and then enters another RR cluster (RR is 1.1.1.1). MED is 0.
The second packet shows that this UPDATE message is originated from 3.3.3.3 (which may or may not be a Route Reflector, we don't know) and that RR cluster 1.1.1.1 has received it. MED is again 0.
Theory Best Path Selection Algorithm
First, exclude routes which have an inaccessible NEXT_HOP.
If NEXT_HOP is accessible, prefer the path with the higher WEIGHT (on Cisco Systems devices); this is locally significant. In the standard, it's not a selection criterion.
If WEIGHT is not set or is the same, prefer the path with the higher LOCAL_PREF.
If LOCAL_PREF values are the same, prefer the path that the router originated itself.
If LOCAL_PREF values are the same and the routes are not locally originated, prefer the path with the shortest AS_PATH length.
If AS_PATH lengths are the same, prefer the path with the lowest ORIGIN type; i (IGP; native) < EGP < ? (incomplete; redistributed).
If ORIGIN types are the same, prefer the path with the lowest MED (if the candidate routes are announced from the same AS).
If MED is not a tie-breaker, prefer eBGP routes over iBGP routes (if a confederation exists, the selection order becomes eBGP over confederation eBGP over iBGP).
If the remaining routes are iBGP-learned, prefer the path with the lowest IGP metric to its NEXT_HOP. If they are eBGP-learned, prefer the oldest path (it is the more stable one). Also, if multipath is enabled in BGP and the IGP metrics are the same, traffic is load-balanced.
As a last tie-breaker, prefer the path with the lowest BGP router-ID.
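The selection order above can be approximated by sorting candidate paths on a tuple of their attributes. The Python sketch below is a simplification and not a vendor implementation: the Path class and its field names are hypothetical, MED is compared regardless of the neighbor AS, and the oldest-path, confederation and multipath steps are left out.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}   # i (IGP) < e (EGP) < ? (incomplete)


@dataclass
class Path:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: Tuple[int, ...] = ()
    origin: str = "i"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def preference_key(p: Path) -> tuple:
    """Smaller tuple = more preferred; fields follow the order of the steps."""
    return (
        -p.weight,                 # higher WEIGHT wins
        -p.local_pref,             # higher LOCAL_PREF wins
        not p.locally_originated,  # locally originated wins
        len(p.as_path),            # shorter AS_PATH wins
        ORIGIN_RANK[p.origin],     # lower ORIGIN type wins
        p.med,                     # lower MED wins
        not p.ebgp,                # eBGP wins over iBGP
        p.igp_metric,              # lower IGP metric to NEXT_HOP wins
        int(ipaddress.IPv4Address(p.router_id)),  # lowest router-ID wins
    )


def best_path(paths: List[Path]) -> Optional[Path]:
    candidates = [p for p in paths if p.next_hop_reachable]
    return min(candidates, key=preference_key) if candidates else None


short = Path("short AS_PATH", as_path=(60,))
preferred = Path("high LOCAL_PREF", local_pref=200, as_path=(60, 4570, 100))
print(best_path([short, preferred]).name)   # high LOCAL_PREF
```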
Implementation Topology
We will cover basic configurations as well as advanced scenarios.
The topology seen below will be used for all scenarios; we'll modify it if needed. The physical IP addresses are also seen below:
Implementation Addresses/Subnets
Other than the physical IP addresses, we will use loopback subnets as customer or production subnets. We will announce, filter, summarize, redistribute, etc. these subnets. We will play with them.
Loopback0 will be used as the Router-ID and its format is X.X.X.X/32 where X is the router number (e.g. 1 for R1, 3 for R3).
Loopback1-9 use the format 192.168.[X][Y].1/24 where X is the router number and Y is the loopback number. E.g. the 192.168.53.0/24 subnet belongs to Loopback3 of R5.
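The addressing scheme is easy to reproduce programmatically, which helps when checking which router and loopback a given subnet belongs to. The helper names in this Python sketch are hypothetical, and it assumes single-digit router numbers so that the third octet stays a valid two-digit value.

```python
import ipaddress


def router_id(router: int) -> ipaddress.IPv4Interface:
    """Loopback0 address X.X.X.X/32 for router X."""
    return ipaddress.ip_interface(f"{router}.{router}.{router}.{router}/32")


def loopback(router: int, loopback_no: int) -> ipaddress.IPv4Interface:
    """Loopback1-9 address 192.168.[X][Y].1/24 for router X, loopback Y."""
    if not (1 <= router <= 9 and 1 <= loopback_no <= 9):
        raise ValueError("router and loopback numbers must be 1-9")
    return ipaddress.ip_interface(f"192.168.{router}{loopback_no}.1/24")


print(router_id(1))             # 1.1.1.1/32
print(loopback(5, 3))           # 192.168.53.1/24
print(loopback(5, 3).network)   # 192.168.53.0/24
```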
Implementation Initial Steps

First initiate the BGP process with an AS number:
R1(config)#router bgp 1230 !BGP process for AS 1230.
Enable BGP peering logging:
R1(config-router)#bgp log-neighbor-changes !In most cases
it's on by default, but if it's not, it's good to turn it
on. With this, BGP neighbor UP/DOWN events and reset reasons are
logged as SYSLOG messages.
Disable the synchronization process:
R1(config-router)#no synchronization !It turns off the
sync. rule (it's explained in previous slides).
Disable auto-summarization:
R1(config-router)#no auto-summary !It's advisable to
disable auto-summarization.
Implementation Peering (1)

Manually define the neighbors and their AS numbers:
R1(config-router)#neighbor 10.0.12.2 remote-as 1230
!Neighbor 10.0.12.2 is in AS 1230; we'll accept a TCP
connection from this IP and will initiate a TCP connection to
this IP.
Since the NEXT_HOP attribute isn't modified for iBGP peerings, as mentioned before, you can modify it manually; because the inter-router segment is not a customer/neighbor AS subnet, you then don't need to advertise it:
R1(config-router)#neighbor 10.0.12.2 next-hop-self !The
NEXT_HOP attribute of UPDATE messages sent to 10.0.12.2 is
set to the IP address of R1's outgoing interface
(the interface used to reach 10.0.12.2).

If authentication is used between peers, the same password must be defined on both ends:
R1(config-router)#neighbor 10.0.12.2 password 0 neteksper_lab !If
you use type 7 instead of 0, the encrypted form must be entered;
with type 0 you must enter the plain text.
If eBGP peering is established through the Loopback addresses, the TTL of the IP packet must be changed:
R1(config-router)#neighbor 6.6.6.6 ebgp-multihop 20 !As mentioned
before, the default TTL value is 1 for eBGP peerings; with this command
you change the TTL value to 20. If you don't enter any number, it
will choose the max. value 255.
If eBGP peering is established through the Loopback addresses, the source IP must be specified:
R1(config-router)#neighbor 6.6.6.6 update-source Loopback0 !OPEN
messages will be sent with this source IP address, since the other side
expects to see this address for the peering.

Basic peering configurations between R1-R2 and R1-R6 are seen
below as examples:

With the show ip bgp neighbors command, you can see all information related to a BGP peering. It shows very detailed information.
With the show ip bgp summary command, you can see the neighbor IP address, the BGP version (4 for the current standard), the neighbor's AS number, how many messages are sent to and received from this neighbor, the table version, input & output queue values, how long the neighborship has been UP, and the state (if it's not ESTABLISHED) or the number of prefixes received from that neighbor (if the state is ESTABLISHED):
Since 0 prefixes have been received, the show ip bgp command shows nothing (this command shows the BGP table):

The show ip protocols command also shows information about all routing protocols running on the router. BGP global parameters, route reflector clients, filters in use and neighbors can be seen with this command:
Implementation Announcing Prefixes (1)

Prefixes can be announced in 3 ways:
With the network command,
With aggregation (with the aggregate-address command),
With redistribution (with the redistribute command).
In each method, the AS_PATH attribute remains empty (with default parameters), since the prefixes are originated from the router itself. Other attributes (such as ORIGIN) can be affected differently.
Implementation Announcing Prefixes (2)

With the network command, we tell the BGP router which routes (with their subnet masks) are announced to other BGP peers. This command has several options such as route-map (to set attribute values, etc.).
R1(config-router)#network 192.168.13.0 mask 255.255.255.0 !192.168.13.0/24
is the connected subnet of Loopback 3; with this command, R1 will
announce this subnet to all BGP neighbors.
With the aggregate-address command, we tell the BGP router to summarize more specific routes which already exist in its BGP table. This command has several options such as route-map, as-set, summary-only, etc.
R1(config-router)#aggregate-address 192.168.12.0 255.255.252.0 !192.168.12-15.0/24
subnets exist in the BGP table (at least one of them); with this
command, R1 announces the summary 192.168.12.0/22 subnet to all BGP neighbors
in addition to the specific ones.
With the redistribute command, prefixes from any routing information source (IGP routes, connected or static routes) that exist in the routing table can be announced to other BGP neighbors. This command also has many different options.
R1(config-router)#redistribute connected !All connected subnets that exist in the
routing table are redistributed into the BGP table and announced to all BGP
neighbors.
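Before configuring an aggregate it can be useful to double-check the summary address and mask. The following Python sketch, using only the standard ipaddress module, confirms that the four /24 subnets from the aggregate-address example collapse into 192.168.12.0/22 with mask 255.255.252.0; it is a calculation aid, not something the router runs.

```python
import ipaddress

# The more specific subnets 192.168.12.0/24 ... 192.168.15.0/24
specifics = [ipaddress.ip_network(f"192.168.{n}.0/24") for n in range(12, 16)]
aggregate = ipaddress.ip_network("192.168.12.0/22")


def summarize(prefixes):
    """Collapse contiguous prefixes into the smallest covering set."""
    return list(ipaddress.collapse_addresses(prefixes))


print(summarize(specifics))    # [IPv4Network('192.168.12.0/22')]
print(aggregate.netmask)       # 255.255.252.0
print(all(p.subnet_of(aggregate) for p in specifics))   # True
```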
Implementation Announcing Prefixes with network
With the network command, the ORIGIN attribute is set to IGP (i); as mentioned earlier, it's a preferred value compared to ? (e is for EGP, which is not used anymore). Other attributes can also be changed with route-maps. On the originating router, NEXT_HOP is 0.0.0.0 since the prefix is locally originated. Inside the AS, AS_PATH is also empty.
The subnet and mask must be exactly the same as seen in the routing table (e.g. if the entry in the routing table is 10.100.2.0/23, the subnet and mask must be 10.100.2.0 & 255.255.254.0), otherwise it doesn't work.
The example below shows that R1 announces Loopback 9, then R2 receives this prefix with ORIGIN value i:

with redistribute
With the redistribute command, the ORIGIN attribute is set to incomplete (?); as mentioned earlier, it's a less preferred value compared to i (route-maps can be used here too).
The example below shows that R1 redistributes only the Loopback 9 connected interface, then R2 receives this prefix with ORIGIN value ?:

with aggregate-address (1)
With the aggregate-address <address> <mask> command, the ORIGIN attribute is set to IGP (i), just like with the network command.
It requires that at least one subnet within the aggregate be in the BGP table (through the network command, redistribution or a prefix received from another BGP peer), otherwise the aggregated subnet is not announced.
If the aggregate-address <address> <mask> summary-only command is used, the more specific subnets in the aggregation range are suppressed.
If the aggregate-address <address> <mask> as-set command is used, it regenerates the AS_PATH information for the aggregated subnet (because during aggregation the AS_PATH information is lost; it's set to empty, just like an internal announcement).
There are other specific parameters/options for the aggregate-address command; I'll try to explain most of those used in real-world scenarios.
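The announcement rules for an aggregate can be modelled with a short Python sketch. It only mirrors the behaviour described above (the aggregate needs at least one more specific subnet in the BGP table, and summary-only suppresses the specifics); the function name and the representation of the BGP table as a list of prefix strings are assumptions made for illustration.

```python
import ipaddress
from typing import List


def announced_prefixes(bgp_table: List[str], aggregate: str,
                       summary_only: bool = False) -> List[str]:
    """Return the prefixes announced once an aggregate is configured."""
    agg = ipaddress.ip_network(aggregate)
    table = {ipaddress.ip_network(p) for p in bgp_table}
    specifics = {p for p in table if p != agg and p.subnet_of(agg)}
    if not specifics:
        # No more specific subnet in the table: aggregate is not announced.
        return [str(p) for p in sorted(table)]
    announced = table | {agg}
    if summary_only:
        announced -= specifics   # suppress the more specific subnets
    return [str(p) for p in sorted(announced)]


table = ["192.168.72.0/24", "192.168.73.0/24", "10.0.0.0/8"]
print(announced_prefixes(table, "192.168.72.0/22"))
print(announced_prefixes(table, "192.168.72.0/22", summary_only=True))
```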

In the example below, the 192.168.72-75.0/24 subnets (4 subnets) are aggregated into one 192.168.72.0/22 subnet by R1; the more specific subnets in the aggregate are also received by R2 in addition to the aggregated subnet, and the AS_PATH information is restored by R1 (since the as-set keyword is added):

In the example below, the 192.168.72-75.0/24 subnets (4 subnets) are aggregated into one 192.168.72.0/22 subnet by R1, the more specific subnets are suppressed by R1 (with the summary-only keyword) and the AS_PATH information isn't restored by R1. So R2 only receives 192.168.72.0/22 with no AS_PATH information. In R1's BGP table, suppressed routes are tagged with s:
Implementation Announcing Prefixes with suppress-map & unsuppress-map
With suppress-map, selected specific routes can be suppressed rather than all specific routes. The syntax is:
R1(config-router)#aggregate-address <address> <mask> suppress-map <route-map>
To do that, we first define the subnets, then match these subnets in the route-map and finally assign this route-map as a suppress-map in the aggregation.
Matched routes ARE NOT announced to the neighbors; they're tagged with s in the BGP table of the router that performs the aggregation.
A suppress-map can't be defined on a per-neighbor basis; it affects all BGP neighbors globally.
With unsuppress-map, selected specific routes can be announced.
This can be defined on a per-neighbor basis, so you can selectively send specific subnets to any neighbor.
The configuration of unsuppress-map is the same (defining subnets, matching them, applying the unsuppress-map). The difference is that matched subnets ARE announced to the neighbor. The syntax is:
R1(config-router)#neighbor <address> unsuppress-map <route-map>
The next 2 pages show both the suppress-map and unsuppress-map cases between R1 & R2.

with suppress-map
In the example below, the 192.168.72-75.0/24 subnets (4 subnets) are aggregated into one 192.168.72.0/22 subnet by R1, and only the 192.168.72.0/24 and 192.168.75.0/24 subnets are suppressed (in other words, the 192.168.73.0/24 and 192.168.74.0/24 subnets are announced in addition to the aggregated /22 subnet).

with unsuppress-map
In the example below, the 192.168.72-75.0/24 subnets (4 subnets) are aggregated into one 192.168.72.0/22 subnet by R1 and the more specific subnets are suppressed by R1 (with the summary-only keyword); only R2 receives the 192.168.73.0/24 and 192.168.74.0/24 subnets in addition to the aggregated /22 subnet:
Implementation Conditionally Announcing Prefixes
To announce prefixes conditionally, we use advertise-map, exist-map and non-exist-map.
Advertise-map defines which routes will be announced when the condition is met.
Exist-map and non-exist-map are used to define the condition:
If exist-map is used with advertise-map, this means "announce prefixes defined in advertise-map ONLY IF prefixes defined in exist-map exist in the BGP table".
If non-exist-map is used with advertise-map, this means "announce prefixes defined in advertise-map ONLY IF prefixes defined in non-exist-map DO NOT exist in the BGP table".
Conditional announcing is configured on a per-neighbor basis. The syntax is:
R1(config-router)#neighbor <IP> advertise-map <route-map> exist-map <route-map>
R1(config-router)#neighbor <IP> advertise-map <route-map> non-exist-map <route-map>
The next pages show both cases.
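The exist-map / non-exist-map logic boils down to a single boolean condition. The Python sketch below models it with plain sets of prefix strings; the function name and parameters are hypothetical, and it returns only the prefixes governed by the advertise-map (other prefixes are advertised regardless of the condition).

```python
from typing import Iterable, List, Set


def conditionally_advertised(bgp_table: Set[str],
                             advertise_map: Iterable[str],
                             condition_map: Iterable[str],
                             mode: str = "exist") -> List[str]:
    """Return the advertise-map prefixes that are sent to the neighbor."""
    condition_present = any(p in bgp_table for p in condition_map)
    if mode == "exist":
        allowed = condition_present
    elif mode == "non-exist":
        allowed = not condition_present
    else:
        raise ValueError("mode must be 'exist' or 'non-exist'")
    if not allowed:
        return []
    # Only prefixes that are actually in the BGP table can be advertised.
    return sorted(p for p in advertise_map if p in bgp_table)


specifics = ["192.168.72.0/24", "192.168.73.0/24", "192.168.74.0/24"]
table = set(specifics) | {"192.168.72.0/21"}
print(conditionally_advertised(table, specifics, ["192.168.72.0/21"], "exist"))
print(conditionally_advertised(table, specifics, ["192.168.72.0/21"], "non-exist"))
```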
Implementation Conditionally Announcing Prefixes with advertise-map & exist-map (1)
The example below shows that, if the 192.168.72.0/21 subnet exists in the BGP table of R1, the 192.168.72-74.0/24 subnets (3 subnets) are allowed to be advertised to neighbor 10.0.12.2 (in addition to other subnets). But if 192.168.72.0/21 disappears from R1's BGP table, these 3 subnets are not advertised to this neighbor (other subnets are still advertised). The next page shows the case where the condition is not met.

with advertise-map & exist-map (2)
Here the condition is not met (the 192.168.72.0/21 subnet does NOT exist in R1's BGP table). As you see, the 3 subnets are NOT sent to neighbor 10.0.12.2.

with advertise-map & non-exist-map (1)
The example below shows that, if the 192.168.72.0/21 subnet does NOT exist in the BGP table of R1, the 192.168.72-74.0/24 subnets (3 subnets) are allowed to be advertised to neighbor 10.0.12.2 (in addition to other subnets). But if 192.168.72.0/21 exists in R1's BGP table, these 3 subnets are not advertised to this neighbor (other subnets are still advertised). The next page shows the case where the condition is not met.

with advertise-map & non-exist-map (2)
Here the condition is not met (the 192.168.72.0/21 subnet exists in R1's BGP table). As you see, the 3 subnets are NOT sent to neighbor 10.0.12.2.
Implementation Conditional Route Injection (1)
Conditional route injection means that the BGP router originates specific subnets from an aggregate/summary.
For this purpose, inject-map and exist-map are used.
Inject-map defines the prefix that will be originated from the aggregate.
Exist-map matches the aggregate and the source of the aggregate.
The command syntax is:
R1(config-router)#bgp inject-map <route-map> exist-map <route-map>
If the copy-attributes keyword is added to the above command, the original attributes are copied to the injected more specific subnet (normally they're not copied).
In the inject-map, a prefix-list is set. In the exist-map, the source and the aggregate are matched:
R1(config)#route-map INJECT_MAP permit 5
R1(config-route-map)#set ip address prefix-list <specific_subnet>
R1(config-route-map)#route-map EXIST_MAP permit 5
R1(config-route-map)#match ip address prefix-list <aggregate_subnet>
R1(config-route-map)#match ip route-source prefix-list <src_of_aggregate>
Implementation Conditional Route Injection (2)
Here the condition is met (R1 learns the 192.168.72.0/21 aggregate from R3, which is 10.0.13.3), so R1 injects 192.168.73.0/24 from the /21 aggregate. R2 receives both the aggregate and the specific subnet, but since we didn't add the copy-attributes keyword, the attributes are not set for the 192.168.73.0/24 subnet.
Implementation Route Reflectors (1)

Route Reflector theory and the traffic flow between RR, client router and non-client router were explained in the theory section.
The configuration syntax is very basic. Assuming that R1 is the route reflector for R2 & R3, the command syntax is:
R1(config-router)#neighbor 10.0.12.2 route-reflector-client
R1(config-router)#neighbor 10.0.13.3 route-reflector-client
There's no specific configuration on the client routers; the standard neighbor configuration is applied.
Implementation Route Reflectors (2)

The example below shows that R1 is configured as an RR and R2 & R3 are configured as RR clients.
Implementation Confederations (1)

Confederation theory was explained in the theory section.
The configuration syntax is very basic: the BGP process is started for the sub-AS, the main AS is defined under the BGP process, and each peered sub-AS is defined under the BGP process too.
R1(config)#router bgp 64520 !initiates the BGP process for
sub-AS 64520
R1(config-router)#bgp confederation identifier 1230
!defines the main AS
R1(config-router)#bgp confederation peers 64530 !means R1
has a peering with a router in sub-AS 64530; this has to
be defined as a confederation peer.
Everything else is the same: eBGP peerings, iBGP peerings, etc.
Implementation Confederations (2)

The example below shows that R1 & R2 are in sub-AS 64520 and R3 is in sub-AS 64530. All of these routers are in AS 1230. In confederations, the NEXT_HOP attribute is not modified even if the peering is eBGP, so it has to be modified manually (with the next-hop-self keyword). Also, the sub-AS is seen in parentheses in the BGP table:
Implementation Traffic Manipulation (1)

Traffic manipulation can be divided into 2 sections: outbound
traffic manipulation and inbound traffic manipulation.

Outbound traffic manipulation deals with how data traffic
leaves your AS.
Inbound traffic manipulation deals with how data traffic arrives
at your AS.
To manipulate traffic, you have 2 choices: you can change
BGP attributes, or you can announce/filter more specific subnets
and rely on the router's route decision mechanism, which prefers
the longest matching prefix.
Implementation Traffic Manipulation (2)

The BGP attributes most commonly used to influence path selection are listed below in the order they are evaluated:
WEIGHT (if your equipment is Cisco) (higher is preferable)
LOCAL_PREF (higher is preferable)
AS_PATH (shorter is preferable)
MED (Metric) (lower is preferable)
The first two are used for outbound traffic manipulation and are applied as an
inbound policy on the router.

The last two are used for inbound traffic manipulation and are applied as an
outbound policy on the router.
As you see, you have full control only over outbound traffic. If the peer AS sets
LOCAL_PREF when it receives prefixes from us, it doesn't matter how we set the
AS_PATH and MED values when advertising these prefixes to the peer AS, because
the peer AS router checks LOCAL_PREF before AS_PATH or MED.
After the attribute(s) are set, you may need to trigger the new policy by soft or hard resetting the BGP
peerings; whether this is needed depends on the attribute, software, vendor, etc.
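On IOS, the resets could be issued roughly as follows. The neighbor address 10.0.16.6 is hypothetical (chosen to resemble the lab's addressing for R6) and is not taken from the slides.

```
! run from privileged EXEC mode on R1; 10.0.16.6 is a hypothetical neighbor address
! soft reset, inbound: re-evaluates received routes against the inbound policy
clear ip bgp 10.0.16.6 soft in
! soft reset, outbound: re-sends our advertisements through the outbound policy
clear ip bgp 10.0.16.6 soft out
! hard reset: tears down and re-establishes the peering (disruptive)
clear ip bgp 10.0.16.6
```

The inbound soft reset relies on the ROUTE REFRESH capability described in the theory section; if the peer does not support it, stored inbound updates (neighbor ... soft-reconfiguration inbound) are needed instead.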
Implementation Outbound Traffic Manipulation (1)

To manipulate outbound traffic, you can change the LOCAL_PREF or WEIGHT (if
your equipment is Cisco) attribute and apply the change as an inbound policy.

This inbound policy affects outbound traffic.
First you decide for which prefix(es) you will change attribute(s) to manipulate the traffic
flow (e.g. by defining a prefix-list), then you create a route-map to match the prefix(es)
& set the attribute(s), and finally you apply this policy under the BGP process for a specific
neighbor.
The syntax is shown below:
R1(config)#route-map <X> permit 10
R1(config-route-map)#match ip address prefix-list <Y> !matches the subnets.
R1(config-route-map)#set local-preference <value> !sets the LOCAL_PREF attribute.
R1(config)#route-map <X> permit 20 !accepts the other remaining subnets, but does not modify anything on them.
R1(config-router)#neighbor <a.a.a.a> route-map <X> in !under BGP process, inbound policy is applied for the neighbor.
Implementation Outbound Traffic Manipulation (2)

The example below shows R1 setting the LOCAL_PREF attribute to 300 for 192.168.41.0/24, 192.168.42.0/24 and 192.168.43.0/24;
this policy is applied to neighbor R6 as an inbound policy. So R1 receives those subnets from R6, modifies the
attribute for the 3 subnets and announces them to its iBGP peers. Now all iBGP peers know that any traffic going to these 3 subnets
should be routed to R1 (even though R3 is connected directly to R4, it chooses the R1, R6, R5, R4 path). By doing this R1
changes the outbound traffic: all outbound traffic for these 3 subnets will go through R6.
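A configuration along these lines would produce that behaviour. It is only a sketch: the prefix-list and route-map names and the R6 address 10.0.16.6 are hypothetical, and the eBGP peering with R6 is assumed to be configured already.

```
! names LP-SUBNETS and FROM-R6-LP and address 10.0.16.6 are hypothetical
ip prefix-list LP-SUBNETS seq 5 permit 192.168.41.0/24
ip prefix-list LP-SUBNETS seq 10 permit 192.168.42.0/24
ip prefix-list LP-SUBNETS seq 15 permit 192.168.43.0/24
!
! sequence 10: raise LOCAL_PREF for the three matched subnets
route-map FROM-R6-LP permit 10
 match ip address prefix-list LP-SUBNETS
 set local-preference 300
! sequence 20: accept everything else unchanged
route-map FROM-R6-LP permit 20
!
router bgp 1230
 ! inbound policy for routes received from R6
 neighbor 10.0.16.6 route-map FROM-R6-LP in
```

Because LOCAL_PREF is carried in iBGP updates, the value 300 reaches the other routers in AS 1230, which is why they all send this traffic toward R1.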
Implementation Outbound Traffic Manipulation (3)

The example below shows R1 setting the WEIGHT attribute to 300 for 192.168.41.0/24, 192.168.42.0/24 and 192.168.43.0/24; this
policy is applied to neighbor R6 as an inbound policy. So R1 receives those subnets from R6 and modifies the attribute
for the 3 subnets. But in this case only R1 is affected, because, as you remember, WEIGHT is only locally significant and is not
announced. R1 sends traffic to R6 for these 3 subnets but the other iBGP peers don't; they'll use R3 to reach them. By doing this R1 changes
only its own outbound traffic: R1's outbound traffic for these 3 subnets will go through R6.
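The WEIGHT variant differs from the LOCAL_PREF policy only in the set clause. Again a sketch: the names and the R6 address 10.0.16.6 are hypothetical, and the peering with R6 is assumed to exist.

```
! names WT-SUBNETS and FROM-R6-WT and address 10.0.16.6 are hypothetical
ip prefix-list WT-SUBNETS seq 5 permit 192.168.41.0/24
ip prefix-list WT-SUBNETS seq 10 permit 192.168.42.0/24
ip prefix-list WT-SUBNETS seq 15 permit 192.168.43.0/24
!
! sequence 10: set the Cisco-specific WEIGHT for the three matched subnets
route-map FROM-R6-WT permit 10
 match ip address prefix-list WT-SUBNETS
 set weight 300
! sequence 20: accept everything else unchanged
route-map FROM-R6-WT permit 20
!
router bgp 1230
 ! inbound policy for routes received from R6
 neighbor 10.0.16.6 route-map FROM-R6-WT in
```

Since WEIGHT never leaves the router, the value 300 appears only in R1's own BGP table.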
Implementation Inbound Traffic Manipulation (1)

To manipulate inbound traffic, you can change the AS_PATH or MED (Metric)
attribute and apply the change as an outbound policy.

This outbound policy affects inbound traffic.
First you decide for which prefix(es) you will change attribute(s) to manipulate the traffic
flow (e.g. by defining a prefix-list), then you create a route-map to match the prefix(es)
& set the attribute(s), and finally you apply this policy under the BGP process for a specific
neighbor.
The syntax is shown below:
R1(config)#route-map <X> permit 10
R1(config-route-map)#match ip address prefix-list <Y> !matches the subnets.
R1(config-route-map)#set metric <value> !sets the MED attribute.
R1(config)#route-map <X> permit 20 !advertises the other remaining subnets, but does not modify anything on them.
R1(config-router)#neighbor <a.a.a.a> route-map <X> out !under BGP process, outbound policy is applied for the neighbor.
Implementation Inbound Traffic Manipulation (2)

The example below shows R1 modifying the AS_PATH attribute by prepending 3 more AS numbers in addition to the original AS (1230 1230
1230 + 1230) for 192.168.11.0/24, 192.168.12.0/24 and 192.168.13.0/24; this policy is applied to neighbor R6 as an outbound
policy. So R1 advertises those subnets to R6 with a modified AS_PATH attribute, and R6 sees those subnets as 4 ASs / 4
hops away. R6 also sees the shorter AS_PATH 4570 1230 for the same subnets from R5, so R6 prefers R5 to reach them. By doing this R1 changes the
inbound traffic: all inbound traffic for these 3 subnets will enter AS 1230 through R3.
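The prepending policy could be written roughly as follows. The prefix-list and route-map names and the R6 address 10.0.16.6 are hypothetical; the peering with R6 is assumed to be configured already.

```
! names PREPEND-SUBNETS and TO-R6-PREPEND and address 10.0.16.6 are hypothetical
ip prefix-list PREPEND-SUBNETS seq 5 permit 192.168.11.0/24
ip prefix-list PREPEND-SUBNETS seq 10 permit 192.168.12.0/24
ip prefix-list PREPEND-SUBNETS seq 15 permit 192.168.13.0/24
!
! sequence 10: prepend our own AS three extra times for the matched subnets
route-map TO-R6-PREPEND permit 10
 match ip address prefix-list PREPEND-SUBNETS
 set as-path prepend 1230 1230 1230
! sequence 20: advertise everything else unchanged
route-map TO-R6-PREPEND permit 20
!
router bgp 1230
 ! outbound policy for routes advertised to R6
 neighbor 10.0.16.6 route-map TO-R6-PREPEND out
```

The router still adds its own AS once when it sends the update over eBGP, so R6 receives 1230 1230 1230 1230 for these three subnets.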
Implementation Inbound Traffic Manipulation (3)

The example below shows R1 setting the MED (Metric) attribute to 1000 for 192.168.11.0/24, 192.168.12.0/24 and 192.168.13.0/24;
this policy is applied to neighbor R6 as an outbound policy (since AS_PATH is evaluated before MED, we've prepended one more AS
to our advertisements for the 3 subnets to make both AS_PATH lengths equal). So R1 advertises those subnets to R6 with a
modified MED (Metric) attribute, and R6 sees that those subnets have a MED (Metric) value of 1000. R6 also sees a MED
(Metric) of 0 for the same subnets from R5, so R6 prefers R5 to reach them. By doing this R1 changes the inbound traffic: all inbound
traffic for these 3 subnets will enter AS 1230 through R3.
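A sketch of the corresponding policy on R1 is given here. The prefix-list and route-map names and the R6 address 10.0.16.6 are hypothetical, and the peering with R6 is assumed to exist.

```
! names MED-SUBNETS and TO-R6-MED and address 10.0.16.6 are hypothetical
ip prefix-list MED-SUBNETS seq 5 permit 192.168.11.0/24
ip prefix-list MED-SUBNETS seq 10 permit 192.168.12.0/24
ip prefix-list MED-SUBNETS seq 15 permit 192.168.13.0/24
!
! sequence 10: equalize AS_PATH length, then make our path less attractive via MED
route-map TO-R6-MED permit 10
 match ip address prefix-list MED-SUBNETS
 set as-path prepend 1230
 set metric 1000
! sequence 20: advertise everything else unchanged
route-map TO-R6-MED permit 20
!
router bgp 1230
 ! outbound policy for routes advertised to R6
 neighbor 10.0.16.6 route-map TO-R6-MED out
```

Note that IOS by default compares MED only between paths learned from the same neighboring AS; bgp always-compare-med lifts that restriction. R6's configuration is not shown in the slides, so whether the lab relies on that command is unknown.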
