Tuesday, October 9, 2018

BH18: Why so Spurious? How a Highly Error-Prone x86/x64 CPU "Feature" can be Abused to Achieve Local Privilege Escalation on Many Operating Systems

Nemanja Mulasmajic and Nicolas Peterson are Anti-Cheat Engineers at Riot Games.

This is about a hardware "feature" available in Intel and AMD chips that can be abused to achieve local privilege escalation.

CVE-2018-8897 – a local privilege escalation: read and write kernel memory from user mode, and execute user-mode code with kernel privileges. Affected Windows, Linux, macOS, FreeBSD, and some Xen configurations.

To fully understand this, you'll need good assembly knowledge and an understanding of privilege models. In the standard (simplified) model, rings 1 and 2 are really never used – just ring 3 (least privileged) and ring 0 (most privileged).

Hardware breakpoints cannot typically be set from userland, though there are often ways to do it via syscalls. When an interrupt fires, it transfers execution to an interrupt handler; lookup is based on the interrupt descriptor table (IDT), which is registered by the OS.

Segmentation is a vestigial part of the x86 architecture now that everything leverages paging, but you can still set arbitrary base addresses. The low two bits of a segment selector encode whether you're in kernel or user mode. Depending on the mode of execution, the GS base means different things (it holds data structures relevant to that mode). If we're coming from user mode, the kernel needs to execute SWAPGS to switch to the kernel-mode equivalent.

MOV SS and POP SS force the processor to suppress external interrupts, NMIs, and pending debug exceptions until the boundary of the instruction following the SS load is reached. The intended purpose was to prevent an interrupt from firing immediately after loading SS but before loading a stack pointer.

It was discovered while building a VM detection mechanism, as VMs were being used to attack Anti-Cheat. They thought: what if a VMEXIT occurs during one of these "blocking" periods? Let's follow the CPUID… They started thinking about what would happen if they fired interrupts at unexpected times.
So, what happens? Why did his machine crash? Before KiBreakpointTrap executes its first instruction, the pending #DB (which had been suppressed by MOV SS) fires, and execution redirects to the #DB handler – which assumes it interrupted the kernel and behaves accordingly, even though execution had actually come from user mode.

Code can be found at github.com/nmulasmajic; if you aren't patched, the system will crash. Showed a demo of two lines of assembly code putting a VM into a deadlock.
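For reference, the trigger really is that small. Here is a minimal sketch in C with GCC inline assembly (the variable name ss_sel is illustrative, and it assumes a hardware watchpoint has been armed on that address via a debug register). On an unpatched system this crashes or deadlocks the machine, so treat it purely as an illustration:

    #include <stdint.h>

    static uint16_t ss_sel;   /* hypothetical: save SS here, then arm a DR0
                                 read watchpoint on this address             */

    void save_ss(void) {
        asm volatile("movw %%ss, %0" : "=m"(ss_sel));
    }

    /* Call after the watchpoint is armed. On an unpatched OS, the pending
       #DB raised by the SS load is delivered on the first instruction of
       the kernel's INT3 handler instead of in user mode. */
    void trigger(void) {
        asm volatile(
            "movw %0, %%ss\n\t"   /* watched load; #DB held pending        */
            "int3\n\t"            /* enter kernel; pending #DB fires there */
            :
            : "m"(ss_sel)
            : "memory");
    }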

They can avoid SWAPGS since Windows thinks they are coming from kernel mode. WRGSBASE writes to the GS base address, so use that!

They fired a #DB exception at an unexpected location, and the kernel became confused. The handler thinks they are privileged, and now they control GSBASE. Now they just need to find instructions to capitalize on this…

They had erroneously assumed there was no memory-dereferencing encoding of MOV SS (e.g., MOV SS, [RAX]) – only a form that doesn't dereference memory. POP SS, however, does dereference stack memory. BUT… POP SS is only valid in a 32-bit compatibility-mode code segment, and on Intel chips SYSCALL cannot be used in compatibility mode. So… they focused on using INT # only.

With the goal of writing memory, they found that by causing a page fault (KiPageFault) from kernel mode, they could reach KeBugCheckEx. That function dereferences GSBASE memory, which is under their control…

It clobbers surrounding memory, so they had to make one CPU "stuck" to deal with writing to the target location. They chose CPU1, since CPU0 had to service other incoming interrupts from the APIC. CPU1 endlessly page faults, then goes to the double-fault handler when it runs out of stack space.

The goal was to load an unsigned driver; CPU0 does the driver loading. They attempted to send TLB shootdowns, forcing CPU0 to wait on the other CPUs by checking the PacketBarrier variable in its _KPCR. But CPU1 is in a dead spin and will never respond. "Luckily," there was a pointer leak in the _KPCR for any CPU, accessible from user mode. (The exploit does require a minimum of two CPUs.)

It is complicated, and it took the researchers more than a month to make it work. So they looked into the syscall handler, KiSystemCall64, which is registered in the IA32_LSTAR MSR. SYSCALL, unlike INT #, does not immediately swap to the kernel stack – this actually made things easier. (SYSCALL functions similarly to INT 3 here.)

Another cool demo :-)

A lot of this was patched in May. MS was very quick to respond, and most OSes should be patched by now. You can’t abuse SYSCALL anymore.

Lessons learned – want to make money on bug bounty? You need a cool name and a good graphic for your vuln (pay a designer!), and don’t forget a good soundtrack!

BH18: How I Learned to Stop Worrying and Love the SBOM

Allan Friedman  | Director of Cybersecurity, NTIA / US Department of Commerce

Vendors need to understand what they are shipping to the customer, and need to understand the risks in what is going out the door. You cannot defend what you don't know. Think about the ingredients list on a box – if you know you have an allergy, you can simply check the ingredients and make a decision. Why should the software/hardware we ship be any different?

There had been a bill before Congress requesting that there always be an SBOM (software bill of materials) for anything the US government buys – so they know what they are getting and how to take care of it. The bill was DOA, but things are changing…

The healthcare sector has started getting behind this, and now people at the FDA and in Washington are concerned about the supply chain. There should not be a healthcare way of doing this, an automotive way of doing this, a DoD way of doing this… there should be one way. That's where the US Department of Commerce comes in. We don't want this coming from a single sector.

Committees are the best way to do this – they are consensus-based. That means the process is stakeholder-driven and no single person can derail it. Think about it like "I push, but I don't steer."

We need software component transparency. We need to compile the data, share it, and use it. The committee kicked off on July 19 in DC. Some folks believe this is a solved problem, but how do we make sure the existing data is machine-readable? We can't just say "use grep"; ideally it could hook into tools we are already using.

The first working group is tackling defining the problem. Another is working on case studies and the state of practice; others on standards and formats, a healthcare proof of concept, and more.

We need more people to understand and poke at the idea of software transparency – it has real potential to improve resiliency across different sectors.

BH18: Keynote! Parisa Tabriz

Jeff Moss, founder of Black Hat, opened the first session at the top of the conference, noting several countries have only one attendee here – Angola, Guadeloupe, Greece, and several others. About half of the world's countries are represented this year! Black Hat continues to offer scholarships to encourage a younger audience, who may not be able to afford to attend. Over 200 scholarships were awarded this year!

To Jeff, it feels like the adversaries have strategies while we have tactics – and that's creating a gap. Think about address spoofing – it's allowed and turned on by default on popular mobile devices, though most consumers don't know what it is or why they should turn it off.

With Adobe Flash going away, some believe this will increase spam and change that landscape. We need to think about that.

Parisa Tabriz, Director of Engineering, Google.
Parisa has worked as a pen tester, an engineer, and more recently as a manager. She has often felt she was playing a game of whack-a-mole, where the same vuln (or a trivial variation of another vuln) pops up over and over. How do we get away from this? We have to be more strategic in our defense.
Blockchain is not going to solve our security problems. (no matter what the vendors in the expo tell you…)

It is up to us to fix these issues. We can make great strides here – but we have to realize our current approach is insufficient.

We have to tackle the root cause, pick milestones to celebrate, and build out our coalition. We need to invest in bold programs – building that coalition with people outside of the security landscape.

We cannot be satisfied with just fixing vulnerabilities. We need to explore the cause and effect – what causes these issues.

Imagine a remote code execution (RCE) is found in your code – yes, fix it, but also figure out why it was introduced (the 5 Whys).

Google started Project Zero in 2014 with the goal of making 0-day hard. The team treats Google products like third-party software and has found thousands of vulnerabilities. But they want to achieve the most defensive impact from any vulnerability they find.

The team found that vendor response varied wildly across the industry – and it never really aligned with consumer needs. There is a power imbalance between a security researcher and the big companies making the software. Project Zero set a 90-day release timeline, which removed the negotiation between researcher and company. A deadline-driven approach causes pain for larger organizations that need to make big changes – but it is leading to positive change at these companies. They are rallying and making the necessary fixes internally.

One vendor improved their patch response time by as much as 40%, and 98% of issues are now fixed within the 90-day disclosure period – a huge change! It's unclear what all of those changes are, but likely improved processes, creating security response teams, etc.

If you care about end user security, you need to be more open. More transparency in Project Zero has allowed for more collaboration.

We all need to increase collaboration – but this is hard with corporate legal, process and policies. It’s important that we work to change this culture.

The defenders are our unsung heroes – they don’t win awards, often are not even recognized at their office. If they do their job well, nobody notices.

We lose steam in distraction-driven work environments. We have to project-manage and keep driving towards the goal.

We need to change the status quo – if you’re not upsetting anyone, then you’re not going to change the status quo.

One project Google is running to change the world is moving people from HTTP to HTTPS on the web platform – not just Google services, but the entire world wide web. They wanted to see a web that is secure by default, not opt-in secure. Older versions of Chrome didn't make it obvious to users which site was the more secure one – something to work on.

Browser standards come from many standards bodies – IETF, W3C, ISO, etc. – and then people build browsers on top of those using their own designs. Going to HTTPS is not as simple as flipping a switch: you need to worry about getting certificates, performance, managing the security, etc.

They did not want to create warning fatigue, or have security state reported inconsistently (that is, a site reported as insecure on Chrome, but secure on another browser).

They needed to roll out these changes gradually, with specific milestones to celebrate. It started with a TLS haiku poetry competition, which led to brainstorming. They shared ideas publicly, got feedback from all over, and built support internally at Google to drive this. They published a paper on how best to warn users, and papers on who was and was not using HTTPS.

They started a grassroots effort to help people migrate to HTTPS, and celebrated big conversions publicly, recognizing good actors. Vendors were given a deadline to transition by, with clear milestones to work against, and could move forward. They also had to work with certificate vendors to make it easier and cheaper to get certificates.

The team ate homemade HTTPS cake and pie! It is important to celebrate accomplishments and acknowledge the difficult work done. People need purpose – it will drive and unify them.

Chrome set out with an architecture that would protect your physical machine from a malicious site. But with lots of data now out in the cloud, cross-site data attacks have grown. Google's Chrome team started the Site Isolation project in 2012 to prevent data from moving that way.

We need to continue to invest in ambitious proactive defensive projects.

Projects can fail for a variety of reasons – management can kill the project, for example. The Site Isolation project was originally estimated at a year, but it actually took six; schedule delay at that level puts a bullseye on you. Another issue can be lack of peer support – be a good team player and don't be a jerk!

Thursday, August 9, 2018

BH18: Lowering the Bar: Deep Learning for Side Channel Analysis

Jasper van Woudenberg, Riscure

The old way of doing side-channel analysis was leakage modeling to pull keys out of the signals. They started researching what happens if you use a neural network for the analysis.

They still need to attach scopes and wires to the device – can't get robots to do that, yet. They do several runs and look for variations in signal/power usage to find leakage in the patterns (and divergence of the patterns).

Then we got a demo of some signal analysis. He made a mistake, and noted that's the problem with humans: we make mistakes.

Understanding the power consumption can give you the result of X (the XOR of the input and the key) – and if we know the input, we can get the key! Still a lot of work to do.
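In other words (a toy sketch of the relationship described above, not their actual model – names are illustrative):

    #include <stdint.h>

    /* If the trace leaks X = input XOR key for one byte, a known input
       recovers the key byte directly. */
    uint8_t recover_key_byte(uint8_t leaked_x, uint8_t known_input) {
        return leaked_x ^ known_input;  /* (input ^ key) ^ input == key */
    }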

In template analysis, you build models of various devices from power traces, then look for other devices using the same chipset and start gathering input for analysis.

The researchers then looked at improving their process with convolutional neural networks (CNNs). There is the input layer (size equal to the number of samples), the convolutional layer (feature extractor + encoding), then dense layers (classifiers), and finally the output layer. Convolutional layers are able to detect features independently of their positions.
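To make "position independence" concrete, here is a minimal sketch (illustrative, not Riscure's code) of what one convolutional filter does to a power trace – the same small kernel slides across the whole trace, so a leaking pattern is detected wherever it occurs:

    #include <stddef.h>

    /* Cross-correlate `trace` (n samples) with `kernel` (k weights) and
       apply ReLU; `out` must hold n - k + 1 values. */
    void conv1d_relu(const float *trace, size_t n,
                     const float *kernel, size_t k, float *out) {
        for (size_t i = 0; i + k <= n; i++) {
            float acc = 0.0f;
            for (size_t j = 0; j < k; j++)
                acc += trace[i + j] * kernel[j];
            out[i] = acc > 0.0f ? acc : 0.0f;  /* ReLU activation */
        }
    }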

There are a lot of visuals and live tracing, hard to capture here, but fascinating to watch :-)

Caveat - don't give too much input or make the network too big, or the model cannot actually learn and will not be able to classify new things (memorizing vs. learning). You need to verify this with validation recall.

Deep learning can really help with side channel analysis and it scales well. It does require network fiddling, but it's not that hard. This automation will help put a dent into better securing embedded devices.


BH18: Legal Liability for IoT Cybersecurity Vulnerabilities

IJay Palansky, Partner, Armstrong Teasdale

IJay is not a cyber security expert, but he is a trial lawyer who handles complex commercial litigation, consumer protection,  and class actions - usually representing the defendant.

There is a difference between data breaches and IoT vulns - they aren't handled the same. There is precedent on data breaches, but not really much on IoT devices. People have been radically underestimating the cost and volume of the IoT lawsuits that are about to come; the conditions are right for a wave of them.

Think about policy. The rules are changing. It is hard to predict how this will play out, so it's hard to say how IoT companies should protect themselves. IJay likes this quote from Jeff Moss - "What would make 'defense greater than offense'..?"

People are trying to get the latest and greatest gadget out, to get the first-to-market advantage. Security slows this down. But if you're not thinking about securing devices up front, you are putting yourself at risk. If you get drawn into litigation, or the media draws attention to it, you need to be able to answer to the media (or a judge) about what you did to meet basic security requirements for that type of device. Think of avoiding the liability. Judges will look for who is the most responsible.

It's estimated that there will be 20 Billion connected devices by 2020.

There are ridiculous items coming online all the time - like a water bottle that glows when you need to drink, the connected Moen shower that sets temperature, and worst of all, the i.Con Smart Condom... oh boy.

These devices have the potential to cause harm, from privacy issues to physical injury. There can be ransomware, DDoS attacks, etc. These are reality - people are remotely hacking vehicles already.

Plaintiffs' lawyers are watching and waiting; they want to make sure they can get something out of it financially. They need to be able to prove harm and attribution (who to blame). Most importantly, the plaintiffs' lawyers don't understand this technology (and neither do the judges), or how the laws here work.

There is general agreement that the security of IoT devices is not where it should be. There will be lawsuits, and once there are some, there will be more (other attorneys will be watching).

This is not the first time that product liability or other law has had to address new technology, but the interconnectedness involved in IoT is unique. Plaintiffs need to show whose fault it was - they could name multiple defendants, who will be so busy showing what the other defendants did wrong that they do the plaintiffs' lawyer's job for them. :-)

There has been some enforcement by regulators, like the response to the TRENDnet webcam hack in January 2012, which resulted in a settlement in 2013.

Some lawyers will be looking for opportunities to take up these cases, to help build a name and reputation.

The Jeep hack was announced in 2015, and Chrysler recalled the vehicles. That's not where the story ends... a class action lawsuit is still moving forward (filed in 2016, but only approved yesterday to proceed). This is where things get interesting - nobody was hurt, but there was real potential for getting hurt. People thought they were buying a safe car, and they were not. What is the value?

It's the reputation loss, safety issues, and cost of litigation that make this all a problem. It's a burden on and distraction for key employees who have to be deposed, find documents, etc.

The engineers and experts get stressed about saying something that will hurt their company, or thinking that they did something wrong that hurt someone. That is a cost.

IJay then walks us through law school in 10 minutes :-)

You need to understand the legal risks and associated costs, so you can factor them in when deciding on the right level of security.

Damages vary by legal claim and the particular harm. Claims can be around things like negligence, fraud or fraudulent omission, breach of warranty, and strict product liability. These are all state law claims, not federal, which means there will be variance.

Negligence means you have failed to take "reasonable care" - often established through expert opinions. Think of the Pinto - it had design defects.

Design defects can be in hardware or software - things like how passwords are handled.

Breach of warranty is an issue as well - there are implied warranties, like merchantability (the assumption that a product is safe and usable). If you know you have an issue and don't tell anyone - that's fraudulent omission.

Keep in mind that state statutes are designed to be consumer friendly, with really broad definitions.

At a minimum, you need to follow industry standards - but even that may not be sufficient.

Think about security at all stages of your design; be informed and ask the right questions; be paranoid and allocate risk. Test, and document the testing you did - save it while you do the work. It will help protect you. Be careful about the words you use around your products: watch what you say in your advertising and don't overstate what you do.

You should also get litigation insurance and make sure it covers IoT.

If it goes wrong - get a good lawyer who knows this area. Investigate the cause, including discussions with engineers.

A wave of IoT hack and vuln litigation is coming - you need to be thinking about this now. Understand and use sound cybersecurity design and engineering principles.

BH18: WebAssembly: A New World of Native Exploits on the Browser

Justin Engler, Technical Director, NCC Group
Tyler Lukasiewicz, Security Consultant, NCC Group

WASM (WebAssembly) allows you to take code written elsewhere and run it in a browser.

Crypto miners and archive.org alike are starting to use WebAssembly.

Browsix is a project to implement POSIX interfaces in the browser, and JsLinux runs an entire OS in the browser. eWASM is a solution for Ethereum contracts (an alternative to Solidity). (And a bunch of other cool things.)

Remember when Java applets used to claim the same things (sandboxing, virtualization, code in the browser)...

WebAssembly is a relatively small set of low-level instructions that are executed by browsers. It's a stack machine: you can push and pop things off the stack (to me the code looks a lot like Lisp). We got a couple of walkthroughs of sample code - they created a table of function pointers (egads! it's like network kernel programming).

WASM in the browser can't do anything on its own (can't read memory, write to the screen, etc.). If you want it to do anything, you need to import/export memory/functionality/etc. Memory can be shared across instances of WASM.

Emscripten will help you create .wasm binaries from C/C++ code, includes built-in C libraries, etc. It can also connect you to JavaScript.

Old exploits from C work in WASM, like format strings and integer overflows. WASM has its own integer types, different from C and different from JavaScript, so you need to be careful sending integers across boundaries (overflow). Buffer overflows are an issue as well. If you try to go past your linear memory, you get a JS error - it doesn't fail gracefully; it's pretty ugly.
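A classic shape for the integer problem (a hypothetical sketch - the function and type names are mine, and it assumes a 32-bit wasm target where size_t is 32 bits):

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct { uint32_t value; } item_t;

    /* `count` arrives from JavaScript. The 32-bit multiplication wraps:
       count = 0x40000001 gives size = 4, but the loop still runs `count`
       times, writing far past the allocation in linear memory. */
    item_t *copy_items(const item_t *src, uint32_t count) {
        uint32_t size = count * sizeof(item_t);  /* wraps */
        item_t *dst = malloc(size);
        if (!dst) return NULL;
        for (uint32_t i = 0; i < count; i++)
            dst[i] = src[i];
        return dst;
    }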

You can now go from a BOF (buffer overflow) to XSS. Emscripten's API allows devs to reference the DOM from C/C++. Character arrays being written to the DOM create the possibility of DOM-based XSS, and a user-tainted value can overwrite a safe value. This type of attack likely won't be caught by any standard XSS scanner. And since JS has control of the WASM memory and tables, XSS should give us control of any running WASM.
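Here is a minimal sketch of that BOF-to-XSS pattern (hypothetical Emscripten code; the names are mine, and it assumes the linker places the two globals adjacently in linear memory, which is typical but not guaranteed):

    #include <emscripten.h>
    #include <string.h>

    static char username[32];                      /* attacker-controlled   */
    static char greeting[64] = "Hello, visitor!";  /* later sent to the DOM */

    void set_username(const char *input) {
        strcpy(username, input);  /* no bounds check: can spill into greeting */
    }

    void render(void) {
        /* Emscripten lets C write to the DOM; a tainted `greeting` (e.g.
           overwritten with "<img src=x onerror=...>") becomes DOM-based XSS. */
        EM_ASM({ document.body.innerHTML = UTF8ToString($0); }, greeting);
    }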

And this even creates new exploits! We can now have a function pointer overflow. Emscripten has functions that run arbitrary code (emscripten_run_script), and you can take advantage of that as long as it's loaded. They discovered that function tables are constant - across compilations and even on different machines.

You don't necessarily need to go after XSS here - you could instead call functions written by the developers, as long as they have the same signature as the real target.
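A sketch of why "same signature" is the key constraint (names are illustrative): wasm indirect calls go through an index into a typed function table, so a corrupted index can redirect a call to any loaded function whose signature matches - no shellcode needed:

    #include <string.h>

    typedef void (*handler_t)(const char *);

    void log_message(const char *msg)   { /* intended target               */ }
    void run_admin_cmd(const char *cmd) { /* same signature: viable hijack */ }

    struct session {
        char      name[16];    /* user input copied here unchecked...   */
        handler_t on_message;  /* ...can rewrite the table index here   */
    };

    void handle(struct session *s, const char *input) {
        strcpy(s->name, input);  /* overflow reaches on_message          */
        s->on_message(input);    /* indirect call through the wasm table */
    }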

They also showed a server-side RCE (remote code execution) - code in the browser starting a process on the server.

Many mitigations from C/C++ won't work on WASM. WASM could adopt things like ASLR and some library hardening. Effective mitigations include control flow integrity and the typed function definitions and indexing (which prevent ROP-style gadgets).

WASM did cover these issues in their security warning, but in a buried paragraph. It should be more obvious.

If you can, avoid emscripten_run_script and friends, run the optimizer (it removes automatically included functions that might have been useful for control-flow attacks), use control flow integrity (though it may be slower) - and you still have to fix your C bugs!

There is a whitepaper out - Security Chasms of WASM.

BH18: AI & ML in Cyber Security - Why Algorithms are Dangerous

Raffael Marty, VP Corporate Strategy ForcePoint

We don't truly have AI yet. Algorithms are getting smarter, but experts are more important. Understand your data and algorithms before you do anything with them; it's important to invest in experts that know security.

Raffael has been working in security for a very long time, and then moved into big data. At Forcepoint, he's focusing on studying user behavior so they can recognize when something bad is happening ("The Human Point System").

Machine learning is an algorithmic way to describe data. In the supervised case, we give the system a lot of training data; in the unsupervised case, we give the system an optimization problem to solve. "Deep learning" is a newer machine-learning approach that eliminates the feature-engineering step. Data mining is a set of methods to explore data automatically. And AI is "a program that doesn't simply classify or compute model parameters, but comes up with novel knowledge that a security analyst finds insightful" (we're not there yet).

Computers are now better than people at playing chess and Go, and they are even getting better at designing effective drugs and making things like Siri smarter.

Machine learning is used in security for things like malware detection, spam detection, and finding pockets of bad IP addresses on the Internet in supervised cases - and for more in unsupervised cases.

There are several examples of AI failures in the field, like the Pentagon training AI to recognize tanks (the "no tank" pictures were sunny and the tank pictures were cloudy, so the system learned to assume no tanks in sunny weather... oops!).

Algorithms make assumptions about the data: they assume it is clean (it often is not), make assumptions about its distribution, and don't deal with outliers. The algorithms are too easy to use today - the process is more important than the algorithm. Algorithms do not take domain knowledge into account. Defining meaningful and representative distance functions is an example: ports look like integers, and algorithms make bad assumptions about "distance" between them.

There is bias in algorithms we are not aware of (for example, translating "he is a nurse. she is a doctor." from English to Hungarian and back again swaps the genders - suddenly she is the nurse).

Too often, assumptions are made based on a single customer's data, or learning happens on an infected data set, or data is simply missing. Another example is an IDS that got confused by IKE traffic and classified it as a "UDP bomb."

There are dangers with deep learning. Do not use it if there is no (or not enough) quality labeled data, and look out for pitfalls like time zones in the data. You need well-trained domain experts and data scientists to oversee the implementation and understand what was actually learned.
Note - there are not a lot of individuals who understand both security and data science, so make sure you build them into a good, strong, cohesive team.

You need to look out for adversarial input - you can add a small amount of noise to an image, for example, that a human cannot see but that tricks a computer into thinking a picture of a panda is really a gibbon.

Is deep learning the solution to everything? Most security problems cannot be solved with deep learning (or supervised methods in general). We looked at a network graph: we might have lots of data, but not enough information, context, or labels - the dataset is actually no good.

Can unsupervised methods save us? Can we exploit the inherent structure within the data to find anomalies and attacks? First we have to clean the data, engineer distance functions, analyze the data, etc...

In one graphic, a destination port was misclassified as a source port (80!), and one record had port 70000! While it's obvious to those of us with network knowledge that the data is messed up, it isn't to the data scientists who looked at it. (With this network data, the data scientists found "attacks" at port 0.)

Data science might classify port 443 as an "outlier" because it's numerically "far" from port 80 - but to those of us who know, they are not "far" from each other at all.
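A toy illustration of the point (hypothetical helper names - the fix is a domain-aware distance function, not treating port numbers as magnitudes):

    #include <stdlib.h>

    int numeric_distance(int port_a, int port_b) {
        return abs(port_a - port_b);   /* 443 vs. 80 = 363: looks "far" */
    }

    static int is_web_port(int port) {
        return port == 80 || port == 443 || port == 8080;
    }

    /* Domain-aware: two well-known web ports are "close" (distance 0). */
    int categorical_distance(int port_a, int port_b) {
        return !(is_web_port(port_a) && is_web_port(port_b));
    }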

Different algorithms struggle with clustered data and the shape of the data. Even if you choose the "right" algorithm, you must understand its parameters.

If you get all of those things right, then you still need to interpret the data. Are the clusters good or bad? What is anomalous?

There is another approach - probabilistic inference; look at Bayesian belief networks. The first step is to build the graph, thinking about the objective and the observable behaviors. If the data is too complicated, you may need to introduce "grouping nodes" and the dependencies between the groups. After all the right steps, you still need to get expert opinions.
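For a feel of the arithmetic a single node in such a network performs (toy numbers made up purely for illustration):

    #include <stdio.h>

    int main(void) {
        double p_comp        = 0.01;  /* prior: P(compromise)      */
        double p_alert_comp  = 0.90;  /* P(alert | compromise)     */
        double p_alert_clean = 0.05;  /* P(alert | no compromise)  */

        /* Bayes' rule: P(compromise | alert) */
        double p_alert = p_alert_comp * p_comp
                       + p_alert_clean * (1.0 - p_comp);
        double posterior = p_alert_comp * p_comp / p_alert;

        printf("P(compromise | alert) = %.3f\n", posterior);  /* ~0.154 */
        return 0;
    }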

Make sure you start by defining your use cases, not by choosing an algorithm. ML is barely ever the solution to your problem. Use ensembles of algorithms, and teach the algos to ask for input! You want them to have expert input and not make assumptions!

Remember - "History is not a predictor, but knowledge is"