EECS 600 (Internet Security) Notes

1. 2016-04-12 Tuesday
2. 2016-03-03 Thursday
3. 2016-03-01 Tuesday
4. 2016-02-25 Thursday
5. 2016-02-23 Tuesday
6. 2016-02-16 Tuesday
7. 2016-02-18 Thursday
8. 2016-02-11 Thursday
9. 2016-02-09 Tuesday
10. 2016-01-28 Thursday
11. 2016-01-26 Tuesday
- 11.1. Virtual Private Networks
- 11.2. Firewalls
12. 2016-01-21 Thursday
13. 2016-01-19 Tuesday
- 13.1. DNSSEC
- 13.2. X.509 Overview
14. 2016-01-14 Thursday
15. 2016-01-12 Tuesday
- 15.1. Logistics
- 15.2. Encryption and Authentication

1 2016-04-12 Tuesday

New Internet Architectures

We've looked at new architectures that prevent DDoS traffic.
However, there are plenty of other unwanted types of traffic online.
We should try to design architectures that solve as many of these problems as possible.

Desired qualities of a new architecture:

Un-spoof-able
Destination gets control over what traffic it receives
Unwanted traffic is dropped closer to the sender (so that it doesn't consume too many network/destination resources)

Background: Capabilities

Capabilities are the basic idea behind the two papers we read on new architectures so far (TVA and Portcullis).
Capability = authorization to communicate with a host.
- Issued by the recipient. Specific to the path the traffic takes, and enforced by routers. Expires quickly.
Fundamental flaw: need to request a capability. By definition, the capability request traffic can't be protected by capabilities, and thus is vulnerable to a denial of capability attack (flooding the capability request channel to starve out legitimate ones).
- TVA tries to address this by using fair queueing.
  - Conference paper just used the incoming link of the router for fair queueing.
  - Journal paper used hierarchical fair queueing, which uses the path tags to queue requests.
- Portcullis tries to address this with proof-of-work traffic.
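
As a rough illustration of the proof-of-work idea, the Python sketch below makes a sender search for a nonce whose hash falls under a target before its capability request would be considered. The puzzle format, the seed, and the function names are assumptions for illustration, not Portcullis's actual construction.

```python
import hashlib
from itertools import count

def puzzle_digest(seed: bytes, nonce: int) -> int:
    """Hash the seed together with a candidate nonce and return it as an integer."""
    return int.from_bytes(hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest(), "big")

def solve_puzzle(seed: bytes, difficulty_bits: int) -> int:
    """Brute-force a nonce whose digest has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if puzzle_digest(seed, nonce) < target:
            return nonce

def verify_puzzle(seed: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution costs a single hash."""
    return puzzle_digest(seed, nonce) < (1 << (256 - difficulty_bits))

# Hypothetical seed; a real scheme would bind it to the request and to fresh state.
nonce = solve_puzzle(b"capability-request", 12)
print(verify_puzzle(b"capability-request", nonce, 12))
```

Verification costs one hash, while solving costs about 2^difficulty_bits hashes on average. That asymmetry is what makes flooding the request channel expensive for the sender.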

Evasive Internet (EIP)

The only way you can talk to a host is with a capability.
Hosts themselves don't handle capability requests; the DNS architecture does!
DNS servers are hosts as well, so you need capabilities to speak with them too!
- You get capabilities to speak with a DNS server from a higher-level DNS server.
Roots are left "unprotected" since they are actually quite well provisioned due to anycast routing.
In order to talk to a host, you request a capability from DNS.
In other words, a capability is identified by a "transient" address.
T-Address:
- Contains source/destination IP, as well as capabilities.
- Bound together by a cryptographic signature from RPKI.
A host may issue a T-address, or a host's authoritative DNS server may do it.
Communication flow:
- Host gets IP address, LDNS, certificate from DHCP server. Maybe even gets a T-address for the LDNS to ease things.
- Host asks LDNS for T-address for a hostname.
- LDNS can go to ADNS for permission, or it can have permission delegated to it so that it can sign T-addresses for itself.
- There is a source/destination t-addr pair.
- The initial packet contains t-address. Subsequent packets are type 2 frames, which only have a 16B flow label.
Questions
- MTU:
  - Bandwidth and MTU sizes are reaching the point that it won't be an issue. But right now, with some finagling, everything can fit into 1500 bytes. There is a wasted packet though.
- Traffic overhead
  - Around a max of 6% overhead in traffic.
- Computation overhead
  - Not really answerable without the resources to build hardware for this. Initial results are from a server implementation, done in software on top of Linux + netfilter.

2 2016-03-03 Thursday

Web Scams Continued

Hiding Content

Can conditionally serve different content by using the user-agent field.
Hide content from user by making their spammed terms/text invisible.
Hide content from search engine by:
- Redirecting - using a refresh meta tag.
- Redirecting - using a script
- Or, probably just dynamically loading content using JS frameworks :/

Fast-Flux Service Network (FFSN)

Hiding web servers (this is especially relevant for phishing)
Use botnets as CDNs
Almost 30% of all domains advertised in spam are FFSN.

Privacy, Anonymity, Tor

Internet is a public network. Spoofing is pretty easy to do through a variety of ways. The routing information necessary for the internet is public. Even encryption doesn't help that much, because it encrypts the payload of your communication, but not the source/destination. And the source/destination of your traffic is pretty revealing already.

Anonymity = the people who are communicating cannot be identified. This is different from confidentiality, where the content of your communication is secret.

Onion Routing

Sender chooses a sequence of routers
Some of these routers may not be trusted, but that's okay
The sender encrypts the routing information in a layer-wise fashion, such that each router only knows the next router, and a piece of encrypted information to send to the next one.
You always need at least three intermediaries.
Also, a proper router should wait until it has a few messages to send at the same time, so that an observer can't just look at its traffic and determine the path for a person's message.
Both of the above considerations incur heavy latency.

Tor

An implementation of onion routing, focusing on anonymous communication.
Not steganographic - it's easy to know that a client is using Tor.
Also, not difficult for a censor to identify all Tor nodes and block them.
- Either by enumerating nodes from directory.
- Or by simply probing IP addresses to see who runs Tor.
Tor is an overlay network, and communication between nodes happens with TLS on TCP.

To create a Tor virtual circuit:

Talk to entry node, do Diffie-Hellman to agree upon a symmetric key.
Then, proxied through the entry node, do Diffie-Hellman with your next selected node to agree upon another symmetric key.
Continue doing this, proxying through your existing VC, adding new nodes to your circuit, until you're happy.

To send message:

Client builds circuit as above.
Client then encrypts the message with each symmetric key in reverse path order. Then sends message through TLS to first node.
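
The layering can be sketched in Python as follows. This is a toy model under stated assumptions: the XOR keystream cipher is a stand-in for a real symmetric cipher and is not secure, the node names and keys are hypothetical, and each layer carries an 8-byte next-hop field so that a node learns only where to forward the rest.

```python
import hashlib

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (NOT secure); applying it twice with the same key decrypts."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + start.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)

def wrap(message: bytes, path: list) -> bytes:
    """Client side: build layers in reverse path order, so the entry node's layer is outermost."""
    payload, next_hop = message, b"dest"
    for name, key in reversed(path):
        payload = toy_cipher(key, next_hop.ljust(8) + payload)
        next_hop = name
    return payload

def peel(cell: bytes, key: bytes):
    """Node side: remove one layer, learning only the next hop and an opaque inner cell."""
    plain = toy_cipher(key, cell)
    return plain[:8].rstrip(), plain[8:]

# Hypothetical three-node circuit; the keys would come from the per-node key agreement.
path = [(b"entry", b"key-1"), (b"middle", b"key-2"), (b"exit", b"key-3")]
cell = wrap(b"GET / HTTP/1.1", path)
hops = []
for _, key in path:
    hop, cell = peel(cell, key)
    hops.append(hop)
print(hops, cell)
```

Each node sees only the name of the next hop and a blob it cannot read; only after the last layer is removed does the original message appear.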

Usage:

Many applications can share a circuit, since making a circuit takes time.
Tor routers don't need root.
You're safe so long as not all of your routers are controlled by the same adversary.
Sybil attack is when an adversary submits a lot of routers to the directory in order to increase the likelihood of controlling all nodes in a path.
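
To get a feel for the Sybil risk, here is a small back-of-the-envelope calculation in Python. It assumes nodes are picked uniformly and independently, which is a simplification and not how Tor's actual path selection works; the function name is hypothetical.

```python
from fractions import Fraction

def p_full_compromise(adversary_fraction: Fraction, path_length: int = 3) -> Fraction:
    """Probability that every node on a path belongs to the adversary,
    assuming uniform, independent node selection."""
    return adversary_fraction ** path_length

for fraction in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 2)):
    print(fraction, p_full_compromise(fraction))
```

Under this assumption the risk grows with the cube of the adversary's share of nodes, which is why flooding the directory with routers pays off for an attacker.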

Hidden Services

You want to host a site that nobody can find.
Your service is identified by a public key.
You choose a bunch of introduction points and set up onion routes to each.
Clients set up a route to a "rendezvous point" (special piece of software)
Then, client communicates that information to an introduction point (probably also through an onion route), which conveys it to you.
You set up your own onion route to the rendezvous point.
The rendezvous point "splices" the communications together, and you're talking.

3 2016-03-01 Tuesday

Spam, Continued

Forging email from addresses.

You can forge a from address in the SMTP MAIL FROM command. This is used by each SMTP server along the delivery path. Each of the servers can use this MAIL FROM command to bounce messages when necessary. This address is saved in the Return-Path header, so that mail servers can send bounced messages back to the "correct" sender.
You can also forge a from address by putting it in the From header. This is usually what's presented to the user, and this can deceive users.

Reflection attacks using forgery:

Send a message spoofing the MAIL FROM to an invalid address.
The destination mail server will "bounce" the message back to the return path.
It will send a bounce message to the spoofed address.

This technique can be used to deliver spam, and also pollute the storage of mail servers.

There are techniques to validate both the MAIL FROM command and the From header.

Sender Policy Framework (SPF)

This technique protects the MAIL FROM command.
When an SMTP client says MAIL FROM: person@example.com, the server verifies the client IP address is allowed to send mail from example.com.
To do this, there is an SPF DNS record that will contain IP addresses of outbound MTAs for a domain.
Problem: what about using relays? Say, there's an MTA in between sender MTA and recipient MTA? In that case, the relay MTA will not pass SPF.
- Well, the relay can construct a new email address at their domain, and then deploy SPF on their domain. This is called envelope rewrite.
- However, this means that the recipient MTA will have to trust that the relay MTA has done their due diligence for checking SPF. They cannot verify the origin MTA's SPF.
Other limitations:
- Doesn't prevent forgery of "From:" header.
- Doesn't prevent forgery of addresses within the origin domain.
- Single broken (non SPF verifying) relay defeats the whole scheme.
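
The core SPF check can be sketched in Python as below. This is a minimal sketch: the policy table is a hypothetical stand-in for the DNS lookup, and real SPF records have a richer syntax and more result codes than the pass/fail shown here.

```python
import ipaddress

# Hypothetical published policy: networks allowed to send mail for each domain.
SPF_POLICY = {
    "example.com": ["192.0.2.0/24", "198.51.100.17/32"],
}

def spf_pass(mail_from: str, client_ip: str) -> bool:
    """Return True if the connecting IP is listed for the MAIL FROM domain."""
    domain = mail_from.rsplit("@", 1)[-1].lower()
    allowed = SPF_POLICY.get(domain, [])   # sketch: no policy is treated as a failure
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(net) for net in allowed)

print(spf_pass("person@example.com", "192.0.2.25"))    # origin MTA
print(spf_pass("person@example.com", "203.0.113.5"))   # relay with an unlisted address
```

The second call shows the relay problem from above: a relay that keeps the original MAIL FROM connects from an address the origin domain never listed.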

DomainKeys Identified Mail (DKIM)

This technique protects the From header.
Sender publishes their DKIM key using DNS.
Sender adds a DKIM-Signature header, which gives a signature of the message.
- Includes the signature algorithm, as well as a list of headers (in order) that should be prepended to the message before verifying the signature.
Limitations (mirroring SPF's limitations):
- Doesn't check SMTP commands: return-path, or recipients. So even though the To header is included in the signature, the RCPT TO command is not verified.
- This means that if a spammer succeeds in getting a single message signed, they can send it out to tons of recipients, and it'll appear valid.

Other Techniques

Detect spam based on campaigns - when the same message is sent many times.
- Server can store hashes of messages and note when multiple recipients get the same message.
- Server can also do P2P messages to other MTAs to verify whether other recipients at those servers have received a message with the same hash.
Phishing Detection
- Fetch referenced page.
- Extract 5 representative terms out of the page (TF-IDF).
- Google search them, and compare the hostname of the link to the top 30 Google results.
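
The campaign-detection idea above (hashing messages and noticing when many recipients get the same one) might look roughly like this in Python. The class name, the whitespace/case normalization, and the threshold are assumptions for illustration.

```python
import hashlib
from collections import defaultdict

class CampaignDetector:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.recipients_by_hash = defaultdict(set)

    def observe(self, recipient: str, body: str) -> bool:
        """Record a delivery; return True once the same body has reached
        `threshold` distinct recipients."""
        normalized = " ".join(body.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        self.recipients_by_hash[digest].add(recipient)
        return len(self.recipients_by_hash[digest]) >= self.threshold

detector = CampaignDetector(threshold=3)
flags = [detector.observe(rcpt, "Buy  cheap watches NOW")
         for rcpt in ("a@example.com", "b@example.com", "c@example.com")]
print(flags)
```

An exact hash is easy to defeat by varying each message slightly, so this only catches campaigns that send identical text.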

Web Spam

Goal: attract a viewer through search.
Grey area: search engine optimizers.
Two attacks: boost your search result, or hide other search results.

Boosting:

Term spamming
- Putting terms into text fields and headers of the page.
- Put it into the URL.
- Put it into links pointing at the page.
- Put lots of terms in the page, to make it relevant to many queries.
- Or, repeat a term many times to make your page rank higher for that term.
Link spamming
- Publish lots of blog entries that link to your URL.
- Create websites and pages that link to you. There is a market for selling links from your pages to other people's pages, in return for getting people to link to your page.
- Honeypot: put a useful resource online, and include hidden links to spam pages.

4 2016-02-25 Thursday

We're finished with Denial of Service Attacks. Now, we're going to talk about spam.

Spam, Ham, Scam

In particular, today we'll talk about email spam.

Spam = unwanted mail, containing ads, phishing, or anything else that will get you to click and get infected.
It has a low uptake rate - about 1%. Spammers don't seem to mind, they just increase volume.
Techniques for blacklisting:
- Blacklist email addresses.
  - Can be spoofed
- By IP address of sending machine
  - difficult to spoof
  - Can be circumvented with botnets, etc
- Block website
  - Can be circumvented by fast-flux networks.

Spam Protection

Content filtering
Message filtering
- That is, filtering based on non-content features.
- For instance, distance between sender and recipient.
- Non-content features are harder for attackers to spoof.
Transfer-time games
- Receiver may increase download time for message.
- For normal senders, this isn't a big deal.
- For spammers, time is money.
Monetary approaches
- Make the sender pay to send.
- Make the sender post a bond that is forfeited if the recipient claims spam.
Originator blacklisting

Content & Message Filtering

Run software on your receiving SMTP server. Extracts features and classifies as messages come in.
Content-based features:
- HTML w/ lots of hyperlinks
- Lots of heavy formatting, red text, big fonts, etc
- Exclamation marks
- Lots of misspelling
- Specific keywords
Transport-based features.
- Path latency, bandwidth
  - Latency correlates with distance, bandwidth correlates with location (home vs server)
- Inter-packet arrival jitter
- Times that a zero-window was advertised (interesting…)
Network-based
- Dynamic originator address
- Source AS number
- Distance of source from recipient
- Residual TTL in IP header.
Per-message features
Aggregate features
- Per-sender: sender age, sending patterns
- Across senders: common content
- Cross recipients: cooperative filtering

Spam Filter Pro/Con

Pro
- Autonomously applied
- Effective
- Killer app for machine learning (naive bayes woo!)
Con
- Constant arms race
- False negatives - Lets some spam through (encourages more spam)
- False positives - block legitimate mail (change email usage)

Blacklisting

As discussed above, based on email, IP, etc.
Open-relay SMTP servers are very prized by spammers, very heavily blacklisted.
Blacklists can be maintained via DNS:
- Query: A 12-34-56-78.emailblacklist.com?
- Response (OK): 12-34-56-78.emailblacklist.com A 10.0.0.0
- Response (BL): 12-34-56-78.emailblacklist.com A 10.0.0.1

Address Aliasing

Multiple email addresses, something we commonly do.
There are actually services that create email addresses with policies (number of emails, length of time valid, etc). These services will enforce these policies, forward messages that adhere, and take care of everything for you.

5 2016-02-23 Tuesday

Content Delivery Networks: Protection or Threat?

CDNs claim to provide protection against "flash crowds" as well as against application-level DDoS attacks. We will see that their claims are a bit overblown, and that CDNs can in fact be brought down by DDoS attacks.

CDN Background

Main site (example.com) delegates one or more domains to the CDN via DNS CNAME.
CDN has distributed network of edge servers, and clients are directed to an edge server by their DNS.
CDN serves "additional" files such as JS, CSS, images, etc. Some amount of dynamic content usually must be served by the edge server, but the rest is cached by the CDN.

Contribution of this paper: an attack that penetrates CDNs, and actually uses them as accomplices in an amplification attack:

Need to be able to scan the CDN platform to discover edge servers.
Need to be able to obtain service from an arbitrary edge server (not just the one the CDN chooses for you).
Need to penetrate the CDN cache, ensuring that the origin server is contacted each time.
Finally, the attack should provide amplification.

Enumerating edge servers:

Simply perform DNS resolutions all over the world.

Penetrating the edge server cache:

First idea: HTTP Cache-Control header. Instruct edge server not to use cache.
- Many CDNs do not honor the header.
Solution: add random string after the ?. The CDN cache may treat the whole URL path as a key into their cache, including the parameters. So a random string would cause a cache miss.
To test this hypothesis, run three tests:
- Prewarm cache for a particular object by making a request. Then make many subsequent requests and record throughput.
- Make an initial request for the object with a random string. Record throughput.
- Make a subsequent request for the object with the same random string. Record throughput.
- Hypothesis: first and second tests would have different means (t-test), as well as second and third. First and third would have the same mean.
- Hypothesis was confirmed.
This avenue for attack can be used against both the CDN as well as the origin site:
- Repeated requests with random strings could pollute cache, eventually evicting valid resources and degrading performance. It is possible for the CDN to mitigate this attack (using hashes, bloom filters, etc), but intuition says that they might not right now.
- Repeated requests against the same origin can be used to take down origin site.
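
The cache-key hypothesis can be illustrated with a toy simulation in Python. Nothing here talks to a real CDN: the `EdgeCache` class, the URL, and the `ignore_query` switch are assumptions made for the sketch.

```python
import secrets

class EdgeCache:
    """Toy edge cache. By default the full URL, query string included, is the cache key."""
    def __init__(self, ignore_query: bool = False):
        self.ignore_query = ignore_query
        self.store = {}
        self.origin_fetches = 0

    def get(self, url: str) -> str:
        key = url.split("?", 1)[0] if self.ignore_query else url
        if key not in self.store:
            self.origin_fetches += 1      # cache miss: the origin is contacted
            self.store[key] = "body of " + url.split("?", 1)[0]
        return self.store[key]

def run(cache: EdgeCache) -> int:
    url = "http://example.com/big.jpg"
    cache.get(url)                                    # prewarm
    for _ in range(5):
        cache.get(url)                                # served from cache
    for _ in range(5):
        cache.get(url + "?" + secrets.token_hex(8))   # random string appended
    return cache.origin_fetches

print(run(EdgeCache()), run(EdgeCache(ignore_query=True)))
```

With the full URL as the key, every random string reaches the origin and leaves a junk entry in the cache. Dropping the query string for objects that never take parameters removes both effects in this model.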

Amplification!

There are many ways to reduce your own bandwidth consumed during this attack:
- Advertise small receive window to throttle connection to CDN.
- Or, just cut (RST) connection once you start receiving data from the CDN (ensuring that they are now consuming origin's bandwidth).
- Do both, (advertising 256 byte window), and you can ensure that you are using very minimal bandwidth from your own computer.
- Meanwhile, if you can confirm that the requested object is present in the cache, you can confirm that the edge server has downloaded the object in full from the origin, consuming much more bandwidth. We can use the same test as above (based on throughput) to confirm that the object does in fact get cached.

So, full attack description:

Get the list of IP addresses of the edge servers.
Determine a good-sized object on the victim site.
Attacker sends requests with random strings appended to each of the edge servers.
Attacker also sets receive window to 256 bytes, and resets connections after receiving the first data segment.

In order to test this, the researcher also hosts the victim server, and can monitor the web server for its quality of service.

Used Coral CDN, which is a research CDN that can be used for this purpose. Two attacks: sustained and burst. Burst tripped rate limiting of CDN, but sustained attack was not rate limited.

The sustained attack achieved 71x decrease in responsiveness from the server. The burst attack achieved 3 orders of magnitude (1000x) decrease for a shorter period of time. The amplification was an order of magnitude.

Response from Akamai:

They provide an API to give URL patterns that should be cached. So, a website could instruct Akamai not to forward requests with additional GET parameters on URLs which are not expected to have GET parameters.

6 2016-02-16 Tuesday

DDoS Defense by Offense

7 2016-02-18 Thursday

Thinning Akamai

8 2016-02-11 Thursday

Web Timeouts and Their Implications

Types of (D)DoS

Application attacks
- frequently HTTP
- grab-and-hold
TCP SYN
Reflection attacks
- ICMP, TCP SYN, DNS, NTP
- Amplification

Recent Trends

Purpose of this class: to dispel the false sense of security that "we have the security measures, we are fine."
Security is not a solved problem, it's an ongoing battle
Further, we are not winning
- As federal spending on "cyber defense" increases linearly, the number of security incidents against federal agencies increases super-linearly.
- Corporate attacks are very common, and cost huge amounts of money.
- Attacks on national infrastructure are becoming much more common.
With impending integration of our everyday lives on the Internet ("Internet of Things"), this could be a "disaster in the making".
Top 85% of attacks
- HTTP Flood, TCP Flood, and DNS Flood
- DNS flood is the top attack (42% of all attacks)

Claim and Hold DoS Attacks

The idea
- Claim a resource on the server, and hold onto it for a long time.
- Because the request is well formed, it is difficult to filter out.
Examples
- Slowloris: attacks HTTP servers, sending requests one byte at a time.
- SlowDroid: attacks any TCP server by sending an endless sequence of spaces.
Web server example:
- Opportunities:
  - Send a SYN, wait for completed handshake.
  - After handshake is completed, start transmitting request s-l-o-w-l-y
  - After request is completed, consume the response s-l-o-w-l-y (using TCP small receive window, and small amounts of acknowledgements).
  - After response is completed, hold onto the connection as long as possible.
- Server has to combat these attacks with timeouts:
  - Application timeout: time from completing handshake to receiving first byte of request
  - Request timeout: time from receiving first to last byte of request
  - Response timeout: time for the client to consume the entire response
  - TCP timeout: time the server holds onto the connection without receiving ACKs.
  - Keep-Alive timeout: time the server will keep a connection alive after the response is completed.

TCP Timeout

Time server allows for client to ACK a data packet.
Send an HTTP request, disappear, and measure one of the following
- Time of last retransmission (this is a lower bound on the timeout, 61% of servers).
  - The majority of these servers persist over 100 seconds!
- Or FIN (9% of servers).
  - The majority of these persist under 10 seconds, but some go longer.
- Or RST (30% of servers).
  - The majority of these servers persist over 100 seconds as well!
To measure the effect of reducing the TCP timeout, two web servers were deployed:
- One which used the regular Linux TCP timeout: 15 retransmissions, exponential backoff, 13-30 minutes total!
- One which used 3 retransmissions, ~600 ms (each?).
- Dropped connections due to timeouts increased by 0.16%. So barely at all.
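
To see where a figure in the range of many minutes comes from, the backoff can be added up directly. The sketch below assumes an initial retransmission timeout of 200 ms that doubles on each attempt and is capped at 120 s; these two constants are assumptions, and the real starting timeout depends on the measured round-trip time.

```python
RTO_MIN_MS = 200        # assumed initial retransmission timeout
RTO_MAX_MS = 120_000    # assumed cap on the backed-off timeout

def total_wait_ms(retransmissions: int) -> int:
    """Sum the timeout after the original send and after each retransmission,
    doubling every time until the cap is reached."""
    return sum(min(RTO_MIN_MS * 2 ** i, RTO_MAX_MS)
               for i in range(retransmissions + 1))

print(total_wait_ms(15) / 60_000, "minutes with 15 retransmissions")
print(total_wait_ms(3) / 1000, "seconds with 3 retransmissions")
```

Under these assumptions the default gives a bit over 15 minutes, and that is a lower bound: paths with larger round-trip times start from a larger timeout. Cutting to 3 retransmissions brings the hold time down to a few seconds.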

Application Timeout

Time the server allows from completing the connection handshake until receiving the first byte of the HTTP request.
Method: Open a TCP connection, don't send a request, and observe how long we can keep it open.
About 36% of sites don't terminate after 20 minutes of waiting. This may be due to the TCP_DEFER_ACCEPT socket option, which allows the server to defer "accepting" the TCP connection until the first byte of the request comes in.
Many sites have timeouts over 100s.
99% of requests happen within a second of the handshake completion. This is a huge disconnect from the allowed timeouts.

Request Timeout

Time from receiving first byte of the HTTP request to receiving the last byte of the request.
Method: open connection, send 1000-byte HTTP GET request, one byte per segment, one segment per second. Observe if (or when) the connection is terminated.
Majority of sites didn't close connection.
85% of requests fit within 1 packet.
99.9% of requests take less than 1 second to be transmitted.

Response Timeout

Time for client to consume entire response (bytes/second)
Method: open a connection, send a request, consume the response at 100 bytes/sec. Observe if server terminates the connection.
Only 24% of sites impose a limit on response rate.
- This corresponds to roughly the fraction of IIS servers in the population, which is the only web server known to impose a response rate limit.
Almost 99% of responses are consumed at a rate of at least 10 Kbps.

Keep-Alive

This is much better addressed, since many people are aware of this timeout and restrict it.

9 2016-02-09 Tuesday

Exit from Hell? Reducing the Impact of Amplification DDoS Attacks

Read dat paper yo.

10 2016-01-28 Thursday

Last time - firewall. Distinguishing feature: interposed in your path to the network.

Today: Intrusion Detection System. These inspect packets, do "deep packet inspection", and try to correlate them together to identify attacks and intrusion attempts.

May have multiple IDS's. For instance, you may set up a DMZ for publicly accessible servers, and have an IDS for them. Then, in the private network, another IDS. Finally, you may put further IDS's closer to valuable assets in order to improve your chances of detection.
Taxonomy of IDS's:
- Signature based vs anomaly detection
  - Signature based: has classifiers for several different attack types. If incoming traffic is classified as an attack, raise an alarm.
    - These can be more accurate, because they can describe specific attacks that may not look anomalous.
    - However, they cannot detect new attacks.
  - Anomaly detection: creates statistical models for "normal" traffic. Then raises an alarm when traffic "sticks out" from the normal model.
    - Can detect new attacks
    - So many false positives
- Network IDS vs Host IDS
  - NIDS: sniff packets on a whole network. Requires lots of computing power. But, this means that they are monitoring an entire network segment.
  - HIDS: Analyzes only at the host-level. Can use OS-level events (even TCP state tracking). Usually deployed on high-value targets.

Evasion

Searching for text or patterns within TCP connections is hard.
- Can't look just for text in a packet (could be spread across multiple)
- Can't look just for text in sequential packets (could be reordered)
- Could we reassemble the byte stream? Tricky…
Defining patterns to search for is hard
- For directory names, could substitute a ton of variations on valid path names.
- Or could use character escapes.
Packet fragmentation!
As an attacker, any of these variations can be combined.
Additionally, IDS's fail "open", such that the network traffic continues to flow if an IDS is overwhelmed. So one common attack is to disable the IDS by volume or some other way, and then carry on your actual attack without detection.
TTL Exploit:
- An attacker may send multiple TCP segments with the same sequence number, but different payloads, and different TTL fields.
- The ones with lower TTL fields will be dropped, so the end host will only see their intended text.
- But IDS will see all the different segments, and have trouble piecing together the true message.
Fragmentation Attack:
- Overlap:
  - Send IP fragment F1 and F2, where F2 overlaps the end of F1.
  - This is "prohibited" - it should never happen.
  - But, OS's deal with it in different ways.
    - Some use the data from F1
    - Some use the data from F2
    - Some use the first packet to arrive.
  - IDS can't guess behavior of every end system
- Reassembly:
  - Network stacks need to hold onto packets that are fragmented, so that when they receive the next one, they can piece them together.
  - They have a timeout on their buffers, so that they don't hold them indefinitely.
  - If the end host has a longer timer than the IDS:
    - Send F1, wait for the IDS to timeout, then send F2
  - If the IDS has a longer timer than the end-host:
    - Prepare fragments F1, F2, F2', such that F1F2 is an attack, but F1F2' looks ok.
    - Send F1
    - Wait for end-host to timeout.
    - Send F2'
    - … (slides)
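
The overlap ambiguity described above can be made concrete with a small Python sketch. The two policies model "keep whichever byte arrived first" and "let later fragments overwrite"; the function name, the fragment contents, and the byte offsets are made up for illustration.

```python
def reassemble(fragments, policy: str) -> bytes:
    """fragments: (offset, data) pairs in arrival order.
    policy 'first' keeps the byte that arrived first for each position;
    policy 'last' lets a later fragment overwrite earlier data."""
    buf = {}
    for offset, data in fragments:
        for i, byte in enumerate(data):
            pos = offset + i
            if policy == "last" or pos not in buf:
                buf[pos] = byte
    return bytes(buf[pos] for pos in sorted(buf))

f1 = (0, b"GET /index")     # covers bytes 0-9
f2 = (5, b"admin.html")     # covers bytes 5-14, overlapping the end of f1

print(reassemble([f1, f2], "first"))
print(reassemble([f1, f2], "last"))
```

The same two fragments yield a request for /index.html under one policy and /admin.html under the other, so an IDS that guesses the wrong policy inspects a different request than the end host receives.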

Port Scan Evasion:

If a computer (Patsy) has deterministic IP IDs, an attacker can send a ping stream to view the IP ID state. Then, they can send TCP SYN's to a victim, spoofed as the Patsy.
If the Victim's port is open, they will SYN-ACK to the Patsy, and the Patsy will RESET back, which uses an IP ID. The attacker can observe this and have good evidence that the port is open.

IDS Guidelines

Preventing TTL attacks:
- Deploy as close to destination hosts as possible
Preventing fragmentation attacks:
- Drop fragmented packets at entry
- Not a great solution! What about tunneling? Fragmentation regularly happens when tunneling traffic.
Bifurcation attacks
- Try to analyze all possibilities.
- Also not a good solution.

Traffic Normalizer

Sits in the path of traffic, and resolves ambiguities.
For fragments, it will use the shortest reassembly timer, and drop the next packet fragments after the timeout so nobody gets them (either IDS or end host)
Always rewrite TTL to fixed value.
Always resolve fragment ambiguities in a particular way.
Limitations
- Can't know application behavior
- EG: TCP urgent pointer
  - Some system calls may not return "urgent" data without a special command.
  - So, one way you might read "robot", another way may read "root".
  - The IDS/normalizer can't know how the socket is being used, so it can't tell about these things.

On the bright side, traffic normalizer fails closed, which is good for preventing attacks, but bad if the attacker wants to cut off internet access.

Summary:

NIDS performs stateful deep packet inspection
Difficult - must obtain same view of traffic as the destination.
When IDS view diverges from hosts, problems happen
Normalizers can help keep these views synchronized

11 2016-01-26 Tuesday

Today: IPSec, Firewalls. Next time: Intrusion Detection Systems. Next Tuesday: my paper discussion. Also, be thinking about project ideas (Thu-Tue after my presentation).

11.1 Virtual Private Networks

Institution would like to have completely separate, private network for security.
But it's expensive to geographically link multiple far apart branches, and it's nearly impossible (without a really big retractable cable) to connect roaming employees securely.
So, VPNs provide encrypted transfer across the internet, making things look like you're connected to the same network.

Can use many technologies to create VPN:

IPsec
TLS
DTLS (datagram TLS = TLS for UDP)

IPsec

data integrity
origin authentication
replay attack prevention
confidentiality
blanket coverage (encrypts all types of packets)
can use it in many different ways
- one is at the border routers to tunnel to other places

Security Associations (SA)

state information (including keys) for one direction of communication over IPsec. Contains:
- IP addresses of both sides of communication
- encryption algorithm and key
- type of integrity check used along with authentication key (for signing)
Security Association Database (SAD) is used to store that state information
Security Policy Database (SPD) is used to determine the rules for which datagrams are "subjected" to IPsec, and which ones can just go on to the "secure" side of an IPsec gateway.
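
A rough Python sketch of how these pieces relate. The field names, addresses, algorithm labels, keys, and the way the SAD is keyed are all hypothetical; the point is only that the SPD decides whether IPsec applies, and the SAD holds per-direction state.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class SecurityAssociation:
    src: str            # IP address of the sending side
    dst: str            # IP address of the receiving side
    enc_alg: str        # encryption algorithm
    enc_key: bytes
    auth_alg: str       # integrity check type
    auth_key: bytes

# SAD: one SA per direction of communication (hypothetical gateways and keys).
SAD = {
    ("198.51.100.1", "203.0.113.1"): SecurityAssociation(
        "198.51.100.1", "203.0.113.1", "aes-cbc", b"k" * 16, "hmac-sha256", b"a" * 32),
    ("203.0.113.1", "198.51.100.1"): SecurityAssociation(
        "203.0.113.1", "198.51.100.1", "aes-cbc", b"K" * 16, "hmac-sha256", b"A" * 32),
}

# SPD: ordered rules; the first match decides whether a datagram is subjected to IPsec.
SPD = [
    (ipaddress.ip_network("10.2.0.0/16"), "PROTECT"),   # remote branch network
    (ipaddress.ip_network("0.0.0.0/0"), "BYPASS"),      # everything else
]

def policy_for(dst_ip: str) -> str:
    addr = ipaddress.ip_address(dst_ip)
    for network, action in SPD:
        if addr in network:
            return action
    return "DISCARD"

print(policy_for("10.2.3.4"), policy_for("93.184.216.34"))
```

Note that the two SAD entries carry different keys, reflecting that an SA covers only one direction.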

IPsec packet structure (went over in class, wasn't really good for writing down).

Summary of IPsec services: a man in the middle

can't see the original contents of a datagram
can't flip bits without detection
can't masquerade as either party by using their IP address
can't replay a datagram
can see the source and destination IP address

Key Exchange?? IKE = Internet Key Exchange

First, use Diffie-Hellman to establish a secure channel between the two parties
Then, do authentication (could be through a variety of ways)
Finally, create two SA's.

11.2 Firewalls

Firewall: Isolates organization's internal network from larger Internet, allowing some packets to pass, blocking others.

Three types:

Stateless
Stateful
Application gateway

Stateless packet filtering:

Only makes decision based on info within the packet
- source/destination IP address
- IP protocol
- source/destination port (for UDP and TCP)
- other fields from headers
Examples:
- block by port numbers to block services like Telnet
- block inbound TCP segments with ACK=0, to prevent external hosts from opening TCP connections with internal hosts
- block incoming/outgoing DNS except from approved DNS resolver
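
The three example rules can be sketched as a stateless filter in Python. Every decision uses only fields of the packet at hand. The `Packet` fields, the resolver address, and the default-allow policy are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

APPROVED_RESOLVER = "10.0.0.53"   # hypothetical internal DNS resolver

@dataclass
class Packet:
    direction: str               # "in" or "out" relative to the internal network
    protocol: str                # "tcp" or "udp"
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    ack: Optional[bool] = None   # TCP ACK bit; None for non-TCP packets

def allow(pkt: Packet) -> bool:
    # Rule 1: block Telnet by port number.
    if pkt.protocol == "tcp" and pkt.dst_port == 23:
        return False
    # Rule 2: block inbound TCP segments with ACK=0 (connection attempts from outside).
    if pkt.direction == "in" and pkt.protocol == "tcp" and pkt.ack is False:
        return False
    # Rule 3: DNS is only allowed to or from the approved resolver.
    if 53 in (pkt.src_port, pkt.dst_port):
        return APPROVED_RESOLVER in (pkt.src_ip, pkt.dst_ip)
    return True

syn_from_outside = Packet("in", "tcp", "203.0.113.9", "10.0.0.7", 40000, 80, ack=False)
reply_from_outside = Packet("in", "tcp", "203.0.113.9", "10.0.0.7", 80, 40000, ack=True)
print(allow(syn_from_outside), allow(reply_from_outside))
```

Because the filter keeps no state, it cannot tell whether an inbound segment with ACK=1 really belongs to a connection an internal host opened; that gap is what stateful filtering addresses.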

Stateful

EG, TCP connection tracking (yo…. PyWall!!!)

Application Gateways

Filters packets by application data as well as their header fields
for example, a web filter

12 2016-01-21 Thursday

Points from "SSL Landscape - A Thorough Analysis of the X.509 PKI Using Active and Passive Measurements".

(Note I would like to discuss at end of class: revocation lists - why aren't they signed (to address Google's concern about revocation lists)).

Read the paper. The points were from there.

13 2016-01-19 Tuesday

Defense for DNS

Close resolvers to outside queries!
- But, WiFi routers frequently offer a "trap door" to indirectly query resolvers even when they're closed.
- Some need to be open (Google public DNS)
Ultimate solution to DNS security: DNSSEC

13.1 DNSSEC

Goals

Origin Authentication: make sure that a response really came from the authoritative "zone"
Integrity: make sure that the response hasn't been changed

Main Ideas

Each zone gets a public/private key.
Public key goes into a DNSKEY record.
Private key is used to sign records, and since records are typically static, ideally the private key will be kept offline.
Signature is stored in RRSIG record.

Authenticity of DNSKEY

Get your key signed by your parent's ADNS.
- When you get authority for your zone, you get your key signed.
- Your parent ADNS keeps NS record for your ADNS.
- They also keep a DS (Delegation Signer) record, which holds a hash of your DNSKEY and is signed by the parent.
Somewhat similar to a CA hierarchy.

Non-Existing Name

When you query for a non-existing name, the server returns an NXDOMAIN response.
How to trust that, when a man in the middle could simply capture one signed NXDOMAIN response and replay it in answer to every query?
You would have to sign each NXDOMAIN response individually, which would require an online private key.
Instead, we use an NSEC record, which specifies a "bracket" between subdomains (e.g. "a.case.edu" through "d.case.edu"), and says that there are no valid domains in between them.
All signatures expire (so that this will become invalid after a while).
So, an invalid query would return NXDOMAIN, plus the correct bracket around the invalid domain.
Sadly, this introduces the problem that now you can easily enumerate all DNS names in a zone.
Instead, we can hash the valid names, and order those. We use those brackets instead! NSEC3 records!
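
A small sketch of the bracket idea, under simplifying assumptions: names are ordered as plain strings (real DNSSEC uses a canonical label-by-label ordering), and the NSEC3-style variant uses one unsalted SHA-1 over the text name (real NSEC3 uses a salted, iterated hash over the wire-format name). The zone contents and the function names are made up for illustration.

```python
import bisect
import hashlib
from typing import List, Optional, Tuple

# Hypothetical zone contents, kept sorted.
ZONE_NAMES = sorted(["a.case.edu", "d.case.edu", "mail.case.edu", "www.case.edu"])


def nsec_bracket(query: str, names: List[str]) -> Optional[Tuple[str, str]]:
    """Return the (previous, next) names around a missing name, or None if it exists.

    `names` must be sorted. Each bracket can be signed offline ahead of time.
    """
    if query in names:
        return None
    i = bisect.bisect_left(names, query)
    # The chain wraps around: the last name's bracket points back to the first.
    return names[i - 1], names[i % len(names)]


def name_hash(name: str) -> str:
    """Simplified stand-in for the NSEC3 hash."""
    return hashlib.sha1(name.encode("ascii")).hexdigest()


def nsec3_bracket(query: str, names: List[str]) -> Optional[Tuple[str, str]]:
    """Same bracket lookup, but over the sorted hashes of the names."""
    hashed = sorted(name_hash(n) for n in names)
    return nsec_bracket(name_hash(query), hashed)
```

With plain brackets, querying "b.case.edu" reveals the real names "a.case.edu" and "d.case.edu". With hashed brackets the response reveals only two hashes, so walking the zone no longer hands out names directly.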

Chain of Trust

We now have two different trust hierarchies: DNSSEC, and X.509 (that is, PKI, or the SSL certificate hierarchy). (There is actually a third hierarchy for router advertisements, based on the hierarchy of ISPs, etc).

Problems

Although DNSSEC solves authentication and integrity, it makes DNS reflection attacks worse.
- This is because it makes responses even bigger, providing a bigger amplification for reflection attacks.
Responses are too big for UDP.
As a result, deployment is very low.
- This may change if we are able to prevent forged source IP addresses.
- Or, if a high-profile DNS poisoning attack occurs (e.g. Google).

13.2 X.509 Overview

Correction of TLS handshake:

The 5th exchange of the TLS handshake presented last time indicated that the server sends back a master key encrypted with the client's public key.
However, this would allow a hijacking!
Instead, each side generates some random bits, and they are exchanged in the first two exchanges. Then, the client sends some bits to the server encrypted with the server's public key.
Both sides compute the master key from those bits.
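
A sketch of the corrected flow, with loud assumptions: the public-key encryption of the client's secret bits is omitted, and `derive_master` uses a single HMAC-SHA256 as a stand-in for the key derivation function (real TLS specifies its own PRF). All names here are hypothetical. It only shows that both sides can compute the same master key from the three values without the key itself ever being sent.

```python
import hashlib
import hmac
import secrets


def derive_master(pre_master: bytes, client_random: bytes,
                  server_random: bytes) -> bytes:
    """Stand-in key derivation: mix the secret bits with both public randoms."""
    return hmac.new(pre_master,
                    b"master secret" + client_random + server_random,
                    hashlib.sha256).digest()


# Exchanges 1 and 2: each side sends random bits in the clear.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

# The client picks secret bits and would send them encrypted with the
# server's public key (the encryption step is not shown here).
pre_master = secrets.token_bytes(48)

# Each side derives the master key independently.
client_master = derive_master(pre_master, client_random, server_random)
server_master = derive_master(pre_master, client_random, server_random)
```

An eavesdropper sees both randoms but not the secret bits, so it cannot run the same derivation.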

Holz, Braun, Kammenhuber, Carle: SSL Landscape (IMC'11)

Analysis of certificates in the wild.
Datasets:
- Top million Alexa sites. (from 9 vantage points)
- Third-party scan of the whole internet.
- Passive monitoring of a research institution's 10Gb link.

Weaknesses of X.509

Any compromised CA can sign a certificate for any domain.
- DigiNotar attack (2011)
If an EE (end entity) is compromised, an attacker can masquerade as it.
If a browser doesn't keep up to date with revocations, or doesn't validate certificates at all, its users can be connected to fraudulent sites.
If users ignore warnings, they can get screwed too.

14 2016-01-14 Thursday

14.1 Encryption: Message Authentication Problem

As we saw last time, if I want to send an encrypted message to Misha, I can probably easily find a public key associated with him. However, I have no way of being certain that the public key I find belongs to him. Someone could have maliciously crafted that key so that they could intercept my encrypted messages to him.

One solution to this is to enlist a third-party to help. If we all trust this third party, then I can send my public key to them, have them sign it (encrypt the hash with their private key), and then include that signature with my key when I send it to Misha. Misha could verify the signature, and if he trusts the third party, he can be certain that the certificate belongs to me.

This third party is called a Root Certificate Authority. These third parties can, as they sign a public key, delegate the CA power to other entities, giving them the power to sign keys for themselves. This creates a trust hierarchy. We also have revocation lists for certificates that should no longer be trusted.

14.2 HTTPS

HTTP:

No privacy
No authentication
No message integrity
Vulnerable to hijacking

Need a way to improve this situation. Enter HTTPS! HTTPS = HTTP over TLS. HTTPS Connection startup:

Client → Server: opens connection with Server
Server → Client: responds with signed certificate and session ID
Client → Server: sends some bits (encrypted with Server's public key), also sends their public key
Server → Client: adds their own bits, generates a random master key, and sends that back, encrypted with client public key
Client → Server: encrypts with the master key a hash of all messages sent previously, guaranteeing they were the one the server was talking to all along.

HTTPS Benefits:

Authentication - somewhat. If we can trust the third parties in the certificate authority hierarchy, we can generally trust that the server is who it claims to be.
Privacy - encrypted communication means people can't listen in after the initial negotiation

14.3 DNS

Negative Response Rewriting

When you do a DNS query with your local DNS resolver (typically operated by an ISP), and the result is an error (failure), the resolver may rewrite the error message with their own message.
ISP responds with a record pointing to a "buy this domain?" page or a "search" page for you. How helpful! :(
My ISP (Time Warner Cable) does this >:(

Search Engine Hijacking

ISP redirects www.google.com to their own servers, which communicate with google for searches, but modify the results, advertise, etc.
Not widespread anymore, because ISPs were caught doing it.

Kaminsky Attack

What if DNS resolver is honest? Well, it can still be subverted.
An attacker can send queries to DNS servers for random subdomains of existing domains.
Then, it can spoof the Authoritative DNS server of that domain, and send back a response that includes a NS record assigning the new nameserver to be the attacker's.
If the resolver accepts the fake NS record, its cache is polluted, and everyone who uses that resolver gets records resolved by a fake ADNS server.
Kaminsky showed that, even with 16-bit transaction identifiers, it is only a matter of minutes before the attacker can correctly guess the txid and poison a cache. Which is not a bad cost at all if you can poison a cache for a long time.
However, if the DNS resolver uses random ephemeral source ports for its queries (and so for receiving the responses), that adds up to 16 more bits of entropy, which makes the attack pretty much impossible.
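
Some back-of-the-envelope arithmetic for the two cases, under an assumed model: in each "race" the attacker triggers one query and gets in `forged_per_race` distinct guesses before the real answer arrives. The value 100 is an arbitrary assumption, and the function names are hypothetical.

```python
def success_probability(space_bits: int, forged_per_race: int, races: int) -> float:
    """Chance that at least one race is won when guessing in a 2**space_bits space."""
    p_race = min(1.0, forged_per_race / 2 ** space_bits)
    return 1 - (1 - p_race) ** races


def expected_races(space_bits: int, forged_per_race: int) -> float:
    """Average number of races needed before one forged response matches."""
    return 2 ** space_bits / forged_per_race


FORGED_PER_RACE = 100  # assumed number of spoofed responses per race

txid_only = expected_races(16, FORGED_PER_RACE)       # 16-bit transaction ID
txid_and_port = expected_races(32, FORGED_PER_RACE)   # plus ~16 bits of port
```

Whatever the per-race rate, the port randomization multiplies the expected work by 2^16 = 65536, which turns minutes into months.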

Preplay attack

DNS resolvers on home routers can be pretty dumb.
Some don't check for the IP address of the RDNS, or the transaction ID, or the destination port!
So, the attacker can simply send a request and then an answer, no guessing required, and poison that household's DNS cache.

You can also use the WiFi DNS forwarders to conduct Kaminsky attacks on ISP DNS servers that don't accept external queries. So that's a downer. The number of open forwarders has been declining (was 30 million, now is 17 million), but still isn't great.

15 2016-01-12 Tuesday

15.1 Logistics

Goals:

Many subfields of networking are fleeting, but a few like Security and Measurements are long-term. So the topic of Security is a prudent one to study.
We will prepare for research in this area by reading papers, presenting them, and undertaking a research project.
This is not computer security:
- We will not study buffer overflows, SQL injection, etc. Nothing that allows someone to take over a computer.
- Instead, we will cover attacks that don't require compromising the computer, just the network.

I'm sure topics will be available online, and we'll get to them soon enough.

Tasks:

Read papers, write summaries (which "isn't as easy as you think").
Present a paper and lead the discussion on it.
Final exam = project + presentation

Paper Summaries:

Topic
- 1-2 sentence summary of the topic.
- Is it a "mechanisms" or "characterizations" paper?
  - Mechanisms: proposes some new technique
  - Characterizations: measures some property of the internet
  - Sometimes it's both
Key Ideas/Observations
- 1-2 most important ideas
- For mechanisms, the ideas behind the mechanisms
- For characterizations, the key observations.
Flaws (1 or 2)
Open Questions (1 or 2)
- Where can you take the research from here?
My Impressions
- 1-2 sentences with your impression (is it good or not, why?)

Your summaries must follow these sections exactly. Some may be absent, but everything must have a section. Due before class so we can discuss (submitted online via email in plain text). Note that this is mostly a summary! Your opinions on the paper only belong in the small "My Impressions" section.

Some questions to ask about the paper, to identify flaws.

Can someone else replicate the results using the paper?
Are the assumptions clear enough? (Careful, we're not assessing, just learning). Do the assumptions potentially bias the paper?
Does the research eliminate alternative explanations of results?
Do conclusions follow logically from evidence?

Dos and don'ts in paper summaries:

Not a review.
Inappropriate characterizations are "comprehensive", or "insufficient references".
This is what you need to remember 10 years from now.
Value conciseness.
Value specificity: focus on "how", not "what"

Discussion guidelines:

Lecture for about 30-40 minutes.
- Allocate 2 minutes per slide. You expect they will be quicker, but usually your explanations take longer than you think.
Discussion for 30-40 minutes
Wrap up (5 mins)

Grading:

20% paper presentation.
40% paper summaries
10% class participation (attendance + discussions)
30% final

Office hours 2:40-3:40 Thursday, or by appointment. Prefix email subject lines with "eecs600".

15.2 Encryption and Authentication

This content will be a brief overview of the basics. Just enough for our subsequent discussions on security.

Two varieties of encryption: symmetric and asymmetric

15.2.1 Symmetric encryption:

Two parties share a secret.
Encryption/decryption algorithms scramble/unscramble plaintext using the key.
Typically very fast, and the key is very small.
However, it is very difficult to distribute keys securely.

Diffie-Hellman's protocol:

Alice and Bob agree to use a prime number p and a base g (which have some properties we won't discuss).
Alice chooses secret integer a, then sends Bob \(A = g^a \mod p\). It turns out that it's very difficult to guess a from A.
Bob chooses secret integer b, then sends Alice \(B = g^b \mod p\).
Alice computes \(s = B^a \mod p = g^{ba} \mod p\)
Bob computes \(s = A^b \mod p = g^{ab} \mod p\).
Alice and Bob now share the secret \(s\).
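
The exchange above, as a toy sketch. The numbers p = 23 and g = 5 are tiny illustration values (real use needs a very large prime), and `dh_exchange` is a hypothetical helper name.

```python
import secrets
from typing import Tuple


def dh_exchange(p: int, g: int, a: int, b: int) -> Tuple[int, int, int, int]:
    """Run both sides of the exchange and return (A, B, s_alice, s_bob)."""
    A = pow(g, a, p)          # Alice sends A = g^a mod p
    B = pow(g, b, p)          # Bob sends B = g^b mod p
    s_alice = pow(B, a, p)    # Alice computes B^a mod p
    s_bob = pow(A, b, p)      # Bob computes A^b mod p
    return A, B, s_alice, s_bob


# Toy parameters: far too small to be secure.
p, g = 23, 5
a = secrets.randbelow(p - 2) + 1   # Alice's secret, in 1..p-2
b = secrets.randbelow(p - 2) + 1   # Bob's secret, in 1..p-2
A, B, s_alice, s_bob = dh_exchange(p, g, a, b)
```

Only A and B cross the wire. With a = 4 and b = 3, Alice sends 4, Bob sends 10, and both arrive at s = 18.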

Invented in 1976!!

15.2.2 Asymmetric encryption

Pair of keys, (S,P). You encrypt with one and decrypt with the other.
With asymmetric encryption, key distribution is very simple. We each make one of our keys public, and we simply encrypt messages to others using their public keys.

Two types of asymmetric keys:

RSA
- Large (1-4 kbits)
- Efficient
Elliptic curve
- Small (about 160 bits)
- Expensive

15.2.3 Uses of Encryption

Message Encryption - make it so that only owners of a key can read it.
Message Signature
- Message may be distributed in the clear.
- Only the key owner could have signed it.
- Verifies authenticity and that the message has not been modified.

Authentication (signing) using symmetric keys.

Append the key to the message.
Hash that to get a "message authentication code", and append that to your original message, and send it.
Recipient can duplicate your actions and verify the message.
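
A sketch of the steps above using SHA-256 as the hash (the choice of hash and the function names are assumptions). The second function is not what was described in class: it is the standard HMAC construction from Python's `hmac` module, shown for comparison because it is what is normally used in practice instead of a bare hash of message plus key.

```python
import hashlib
import hmac


def naive_mac(message: bytes, key: bytes) -> bytes:
    """Append the key to the message and hash the result, as in the notes."""
    return hashlib.sha256(message + key).digest()


def verify_naive(message: bytes, key: bytes, tag: bytes) -> bool:
    """Recipient repeats the computation and compares in constant time."""
    return hmac.compare_digest(naive_mac(message, key), tag)


def hmac_tag(message: bytes, key: bytes) -> bytes:
    """Standard HMAC-SHA256, the usual replacement for the naive scheme."""
    return hmac.new(key, message, hashlib.sha256).digest()


key = b"shared secret"            # hypothetical shared key
message = b"meet at noon"
tag = naive_mac(message, key)     # sent along with the message
```

Anyone without the key can still read the message, but cannot produce a matching tag for a modified one.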

Message Authentication using asymmetric keys

Hash the message, and encrypt it with your private key. Include that as the signature.
Recipient can decrypt your signature using the public key, hash the message, and compare the two hashes.
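
A toy "textbook RSA" sketch of hash-then-sign, with assumptions stated up front: the primes are two small Mersenne primes picked for convenience, the hash is reduced mod N to fit, and there is no padding. Real signatures use much larger keys and a padding scheme, so this is illustration only. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import hashlib

# Toy key: far too small for real use.
P = 2 ** 61 - 1
Q = 2 ** 89 - 1
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"hello")
```

The signature is a number below N, so its size tracks the size of the key rather than the size of the message.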

Interesting side note: with RSA, the signature will necessarily be the same size as the key's modulus, which is inconvenient for large keys.

Sadly, with asymmetric keys, it's difficult to be certain that a key belongs to somebody (and isn't being spoofed). So you need to get keys certified by a trusted third party.