Tuesday, October 9, 2018

BH18: Why so Spurious? How a Highly Error-Prone x86/x64 CPU "Feature" can be Abused to Achieve Local Privilege Escalation on Many Operating Systems

Nemanja Mulasmajic and Nicolas Peterson are Anti-Cheat Engineers at Riot Games.

This is about a hardware feature available in Intel and AMD chips. The “feature” can be abused to achieve local privilege escalation.

CVE-2018-8897 – this is a local privilege escalation: read and write kernel memory from usermode, and execute usermode code with kernel privileges. It affected Windows, Linux, macOS, FreeBSD and some Xen configurations.

To fully understand this, you’ll need a good grasp of assembly and of privilege models. In the standard model, Rings 1 and 2 are never really used, just Ring 3 (least privileged) and Ring 0 (most privileged); this is a simplified view.

Hardware breakpoints cannot typically be set by userland, though there are often ways to do it through syscalls. When an interrupt fires, it transfers execution to an interrupt handler. The lookup is based on the interrupt descriptor table (IDT), which is registered by the OS.

Segmentation is a vestigial part of the x86 architecture now that everything leverages paging. You can still set arbitrary base addresses. The low 2 bits of a segment selector describe whether you’re in kernel or user mode. Depending on the mode of execution, the GS base means different things (it holds data structures relevant to the mode of execution). If we’re coming from user mode, we need to call SWAPGS to switch to the kernel-mode equivalent.
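To make the privilege bits a little more concrete, here is a small sketch that splits a segment selector into its fields. The function names are made up for illustration, and the sample values 0x33 and 0x10 (commonly seen as the 64-bit user and kernel code selectors on Windows) are my assumption, not something from the talk.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                        # descriptor table entry
        "table": "LDT" if selector & 0x4 else "GDT",   # table indicator bit
        "rpl": selector & 0x3,                         # low 2 bits: 0 = kernel, 3 = user
    }


def came_from_user_mode(cs_selector):
    """A handler can look at the saved CS to decide whether SWAPGS is needed."""
    return (cs_selector & 0x3) == 3


print(decode_selector(0x33))      # assumed user-mode code selector
print(came_from_user_mode(0x10))  # assumed kernel-mode code selector
```

The exploit works precisely because a check like this gives the wrong answer when the exception fires at an unexpected moment.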

MOV SS and POP SS force the processor to hold off external interrupts, NMIs and pending debug exceptions until the boundary of the instruction following the SS load is reached. The intended purpose was to prevent an interrupt from firing immediately after loading SS but before loading a stack pointer.

It was discovered while building a VM detection mechanism, as VMs were being used to attack Anti-Cheat. They thought – what if a VMEXIT occurs during a “blocking” period? Let’s follow the CPUID… They started thinking about what would happen if they caused interrupts at unexpected times.
So, what happens? Why did his machine crash? Before KiBreakpointTrap executes its first instruction, the pending #DB (which had been suppressed by MOV SS) fires, and execution is redirected to the #DB handler, which sends execution back to where it *thought* it should go – the kernel (though it had really come from user mode).

Code can be found at github.com/nmulasmajic; if you aren’t patched, the system will crash. They showed a demo of 2 lines of assembly code putting a VM into a deadlock.

They can avoid SWAPGS since Windows thinks they are coming from kernelmode.  WRGSBASE writes to the GSBASE address, so use that!

They fired a #DB exception at an unexpected location, and then the kernel becomes confused. The handler thinks they are privileged, and now they control GSBASE. Now they just need to find instructions to capitalize on this…

Erroneously assumed there was no encoding for MOV SS, [RAX] only immediate. It doesn’t dereference memory, but POP SS does dereference stack memory. BUT… POP SS is only valid in 32-bit compatibility code segment. On Intel chips, SYSCALL cannot be used in compatibility mode. So… focusing on using INT # only.

With the goal of writing memory, they found that if they caused a page fault (KiPageFault) from kernelmode, they could call KeBugCheckEx again. This function dereferences GSBASE memory, which is under their control…

It clobbers surrounding memory. Had to make one CPU “stuck” to deal with writing to target location. Chose CPU1 since CPU0 had to service other incoming interrupts from APIC. CPU1 endlessly page faults, goes to the double fault handler when it runs out of stack space.

The goal was to load an unsigned driver. CPU0 does the driver loading. They attempted to send TLB shootdowns, forcing CPU0 to wait on the other CPUs by checking the PacketBarrier variable in its _KPCR. But CPU1 is in a dead spin… it will never respond. But, “luckily”, there was a pointer leak in the _KPCR for any CPU, accessible from usermode. (The exploit does require a minimum of 2 CPUs.)

It is complicated, and it took the researchers more than a month to make it work. So, they looked into the syscall handler – KiSystemCall64. It is registered in the IA32_LSTAR MSR. SYSCALL, unlike INT #, will not immediately swap to the kernel stack – which actually made things easier. (SYSCALL functions similarly to INT 3 here.)

Another cool demo :-)

A lot of this was patched in May. MS was very quick to respond, and most OSes should be patched by now. You can’t abuse SYSCALL anymore.

Lessons learned – want to make money on bug bounty? You need a cool name and a good graphic for your vuln (pay a designer!), and don’t forget a good soundtrack!

BH18: How I Learned to Stop Worrying and Love the SBOM

Allan Friedman  | Director of Cybersecurity, NTIA / US Department of Commerce

Vendors need to understand what they are shipping to the customer, need to understand the risks in what is going out the door. You cannot defend what you don’t know. Think about ingredients list on a box – if you know you have an allergy, you can simply check the ingredients and make a decision. Why should software/hardware we ship be any different?

There had been a bill before Congress requiring an SBOM (Software Bill of Materials) for anything the US Government buys – so they know what they are getting and how to take care of it. The bill was DoA, but things are changing…

The Healthcare Sector has started getting behind that. Now people in the FDA and Washington are concerned about the supply chain. There should not be a health care way of doing this, an automotive way of doing this, a DoD way of doing this… there should be one way. That’s where the US Department of Commerce comes in. We don’t want this coming from a single sector.

Committees are the best way to do this – they are consensus based. That means it is stakeholder driven, no single person can derail. Think about it like “I push, but I don’t steer”.

We need Software Component Transparency. We need to compile the data, share it and use it.  Committee kicked off on July 19 in DC. Some folks believe this is a solved problem, but how do we make sure the existing data is machine readable? We can’t just say ‘use grep’. Ideally it could hook into tools we are already using.
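To show why machine readability matters here, this is a rough sketch of what "hooking into tools" could look like once the component list is structured data instead of free text. The JSON layout, product name, component versions and advisory list are all invented for illustration; this is not a real SBOM standard or a real advisory feed.

```python
import json

# Hypothetical, minimal SBOM document (not a real format).
SBOM_JSON = """
{
  "product": "example-device-firmware",
  "version": "1.4.2",
  "components": [
    {"name": "openssl", "version": "1.0.2k"},
    {"name": "zlib", "version": "1.2.11"},
    {"name": "busybox", "version": "1.27.2"}
  ]
}
"""


def affected_components(sbom, advisories):
    """Return the components whose (name, version) pair appears in an advisory."""
    known_bad = {(a["name"], a["version"]) for a in advisories}
    return [c for c in sbom["components"]
            if (c["name"], c["version"]) in known_bad]


sbom = json.loads(SBOM_JSON)
advisories = [{"name": "openssl", "version": "1.0.2k"}]  # made-up advisory entry
hits = affected_components(sbom, advisories)
print(hits)
```

With grep you would be guessing at names and version strings; with structured data the question "am I shipping the affected component?" becomes a simple lookup.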

First working group is tackling defining the problem. Another is working on case studies and state of practice. Others on standards and formats, healthcare proof of concept, and others.

We need more people to understand and poke at the idea of software transparency – it has real potential to improve resiliency across different sectors.

BH18: Keynote! Parisa Tabriz

Jeff Moss, founder of Blackhat, started out the first session at the top of the conference, noting several countries have only one person from their country here – Angola, Guadalupe, Greece, and several others. About half of the world’s countries are represented here this year! Blackhat continues to offer scholarships to encourage a younger audience to attend, who may not be able to afford to. Over 200 scholarships were awarded this year!

To Jeff, it feels like the adversaries have strategies, and we have tactics – that’s creating a gap. Think about address spoofing – it’s allowed and turned on on popular mobile devices by default, though most consumers don’t know what it is and why they should turn it off.

With Adobe Flash going away, the belief out there is that this will increase spam and change that landscape. We need to think about that.

Parisa Tabriz, Director of Engineering, Google.
Parisa has worked as a pen tester, engineer and more recently as a manager. She has often felt she was playing a game of “whack-a-mole”, where the same vuln (or a trivial variation of another vuln) pops up over and over. How do we get away from this? We have to be more strategic in our defense.
Blockchain is not going to solve our security problems. (no matter what the vendors in the expo tell you…)

It is up to us to fix these issues. We can make great strides here – but we have to realize our current approach is insufficient.

We have to tackle the root cause, pick milestones and celebrate and build out your coalition.  We need to invest in bold programs – building that coalition with people outside of the security landscape.

We cannot be satisfied with just fixing vulnerabilities. We need to explore the cause and effect – what causes these issues.

Imagine a remote code execution (RCE) is found in your code – yes, fix it, but figure out why it was introduced (the 5 Whys)

Google has started Project Zero – Make 0-Day Hard. Project Zero was formed in 2014, treats Google products like 3rd party. Finding thousands of vulnerabilities. But they want to achieve the most defensive impact from any vulnerabilities they find.

Team found that vendor response varied wildly in the industry – and it never really aligned with consumer needs. There is a power imbalance between security researcher and the big companies making the software. Project Zero has set a 90 day release time line, which has removed the negotiation between a researcher and the big company. A deadline driven approach causes pain for the larger organizations that need to make big changes – but it is leading to positive change at these companies. They are rallying and making the necessary fixes internally.

One vendor improved their patch response time by as much as 40%! 98% of the issues are fixed within the 90-day disclosure period – a huge change!  Unsure what all of those changes are, but guessing it’s improved processes, creating security response teams, etc.

If you care about end user security, you need to be more open. More transparency in Project Zero has allowed for more collaboration.

We all need to increase collaboration – but this is hard with corporate legal, process and policies. It’s important that we work to change this culture.

The defenders are our unsung heroes – they don’t win awards, often are not even recognized at their office. If they do their job well, nobody notices.

We lose steam in distraction driven work environments. We have to project manage, and keep driving towards this goal.

We need to change the status quo – if you’re not upsetting anyone, then you’re not going to change the status quo.

One project Google is doing to change the world is to move people away from HTTP and to HTTPS on the web platform. Not just Google services, but the entire world wide web. We wanted to see a web that was secure by default – not opt-in secure. The old Chrome browser didn’t make it obvious to users which website was the more secure one – something to work on.

Browser standards come from many standards bodies, like IETF, W3C, ISO, etc – and then people build browsers on top of those using their own designs. Going to HTTPS is not as simple as flipping a switch – need to worry about getting certificates, performance, managing the security, etc.

Did not want to create warning fatigue, or to have it be inconsistently reported (that is, a site reported as insecure on Chrome, but secure on another browser).

Needed to roll out these changes gradually, with specific milestones we could celebrate. Started with a TLSHaiku poetry competition, which led to brainstorming.  Shared ideas publicly, got feedback from all over, and helped to build support internally at Google to drive this. Published a paper on how to best warn users.  Published papers regarding who was and was not using HTTPS. 

Started a grassroots effort to help people migrate to HTTPS. Celebrated big conversions publicly, recognizing good actors. Vendors were given a deadline to transition by, with clear milestones to work against, and could move forward. Had to work with certificate vendors to make it easier and cheaper to get certificates.

Team ate homemade HTTPS cake and pie! It is important to celebrate accomplishments, acknowledge the difficult work done. People need purpose – it will drive and unify them.

Chrome set out with an architecture that would prevent a malicious site from attacking your physical machine. But now, with lots of data out there in the cloud, cross-site data attacks have grown. Google’s Chrome team started the Site Isolation project in 2012 to prevent data from moving that way.

We need to continue to invest in ambitious proactive defensive projects.

Projects can fail for a variety of reasons – management can kill the project, for example. The Site Isolation project was originally estimated to take a year, but it actually took six… a schedule delay at that level puts a bulls-eye on you. Another issue could be lack of peer support – be a good team player and don’t be a jerk!

Thursday, August 9, 2018

BH18: Lowering the Bar: Deep Learning for Side Channel Analysis

Jasper van Woudenberg, Riscure

The old way of doing side channel analysis was to do leakage modeling to pull out keys from the signals. Started researching what happens if they use a neural network for the analysis.

They still need to attach the scopes and wires to the device, can't get robots to do that, yet. They do several runs and look for variations in signal/power usage to find leakages from the patterns (and divergence of the patterns).

Then we got a demo of some signal analysis - he made a mistake, and noted that is the problem with humans: we make mistakes.

Understanding the power consumption can give you the value of X (the XOR of the input and the key); then, if we know the input, we can get the key! Still a lot of work to do.
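The last step is simple enough to write down. A minimal sketch, assuming the intermediate value X has already been recovered from the traces (which is the hard part); the byte values are made up.

```python
def recover_key_byte(leaked_x, known_input):
    """If X = input XOR key, then key = X XOR input."""
    return leaked_x ^ known_input


key = 0x2B            # secret key byte (made-up value)
plaintext = 0x41      # input byte the attacker knows
x = plaintext ^ key   # intermediate value that leaks through power usage

recovered = recover_key_byte(x, plaintext)
print(hex(recovered))
```

XOR is its own inverse, which is why leaking the intermediate value is as good as leaking the key once the input is known.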

In template analysis, you build models around various devices from power traces - then look for other devices using the same chipset, and then can start gathering input for analysis.

The researchers then looked at improving their processes with Convolutional Neural Networks (CNNs). There is the input layer (size is equal to the number of samples), the convolutional layer (feature extractor + encoding), then the dense layers (classifiers) and finally the output layer. Convolutional layers are able to detect features independently of their positions.
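The position-independence point is worth a tiny illustration. This is only a toy sketch in plain Python, not the speaker's network: one hand-written filter slid across a trace, followed by a max over all positions (standing in for pooling). The pattern and traces are invented.

```python
def conv1d(signal, kernel):
    """Slide the kernel across the signal and record the response at each offset."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def detect(signal, kernel):
    """Strongest response anywhere in the trace, regardless of where it occurs."""
    return max(conv1d(signal, kernel))


pattern = [1, 3, 1]                       # made-up leakage shape
early = [0, 1, 3, 1, 0, 0, 0, 0]          # pattern near the start
late = [0, 0, 0, 0, 1, 3, 1, 0]           # same pattern, shifted
flat = [0, 0, 0, 0, 0, 0, 0, 0]           # no pattern at all

print(detect(early, pattern), detect(late, pattern), detect(flat, pattern))
```

The shifted trace gives the same response as the original, which is why misaligned traces are less of a problem for a convolutional layer than for a point-by-point comparison.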

There are a lot of visuals and live tracing, hard to capture here, but fascinating to watch :-)

Caveat - don't give too much input or make the network too big, or the model cannot actually learn and will not be able to classify new things (memorizing vs. learning). Need to verify this with validation recall.

Deep learning can really help with side channel analysis and it scales well. It does require network fiddling, but it's not that hard. This automation will help put a dent into better securing embedded devices.


BH18: Legal Liability for IoT Cybersecurity Vulnerabilities

IJay Palansky, Partner, Armstrong Teasdale

IJay is not a cyber security expert, but he is a trial lawyer who handles complex commercial litigation, consumer protection,  and class actions - usually representing the defendant.

There is a difference between data breach and IoT vulns. They aren't handled the same. There is precedent on data breaches, but not really much on IoT devices. People have been radically underestimating the cost and volume of IoT lawsuits that are about to come. The conditions are going to be right for a wave of lawsuits.

Think about policy. The rules are changing. It is hard to predict how this will play out, so it's hard to say how IoT companies should protect themselves. IJay likes this quote from Jeff Motz - "What would make 'defense greater than offense'..?" (Motz? maybe Moss?)

People are trying to get the latest and greatest gadget out, to get the first-to-market advantage. Security slows this down. But if you're not thinking about device security up front, you are putting yourself at risk. If you get drawn into litigation or the media draws attention to it, you need to be able to tell the media (or a judge) what you did to meet basic security requirements for that type of device. Think of avoiding the liability. Judges will look for who is the most responsible.

It's estimated that there will be 20 Billion connected devices by 2020.

There are ridiculous items coming online all the time - like the water bottle that glows when you need to drink, the connected Moen shower to set temperature, and the worst the i.Con Smart Condom... oh boy.

These devices have potential to harm, from privacy issues to physical harm.  There can be ransomware, DDoS attacks, etc. These are reality - people are remotely hacking vehicles already.

Plaintiffs' lawyers are watching and waiting; they want to make sure they can get something out of it financially. They need to be able to prove harm and attribution (who to blame). Most importantly, the plaintiffs' lawyers don't understand this technology (and neither do the judges), or how the laws here work.

There is general agreement that the security of IoT devices is not where it should be. There will be lawsuits, and once there are some, there will be more (those other attorneys will be watching).

This is not the first time that product liability or other law has had to address new technology, but the interconnectedness involved in IoT is unique. They need to show whose fault it was - there could be multiple defendants, and they will be so busy showing what the other defendant did wrong that they end up doing the plaintiffs' lawyer's job for them. :-)

There has been some enforcement by regulators, like the response to TRENDnet Webcam hack in Jan 2012, which resulted in a settlement in 2013.

Some lawyers will be looking for opportunities to take up these cases, to help build a name and reputation.

The Jeep hack was announced in 2015, then Chrysler recalled the vehicles. That's not where the story ends... there is a class action lawsuit moving forward still. (filed in 2016, but only approved yesterday to go forward). This is where things get interesting - nobody was hurt, but there was real potential of getting hurt.   People thought they were buying a safe car, and they were not. What is the value?

There is reputation loss, safety issues, and the cost of litigation that makes this all a problem. It's a burden and distraction on key employees that have to be deposed, find documents, etc.

The engineers and experts get stressed about saying something that will hurt their company, or thinking that they did something wrong that hurt someone. That is a cost.

IJay then walks us through law school in 10 minutes :-)

You need to understand the legal risks and associated costs, so you can weigh them when making decisions on the right level of security.

Damages vary by legal claim and the particular harm. Claims can be around things like negligence, fraud or fraudulent omission, breach of warranty, and strict product liability. These are all state law claims, not federal, which means there will be variance.

Negligence means you have failed to take "reasonable care" - often based on expert opinions. Think of the Pinto - it had design defects.

Design defects could be around hardware or software, things like how passwords are handled.

Breach of warranty is an issue as well - there are implied warranties, like that of merchantability (the assumption that the product is safe and usable). If you know you have an issue, and don't tell anyone - that's fraudulent omission.

Keep in mind that state statutes are designed to be consumer friendly, with really broad definitions.

You need to minimally follow industry standards, but that may not be sufficient.

Think about security at all stages of your design, be informed and ask the right questions, be paranoid and allocate risk. Test and document the testing you did, and save it while you do the work. It will help protect you. Be careful about the words you use around your products, watch what you say in your advertisements and don't overstate what you do.

You should also get litigation insurance and make sure it covers IoT.

If it goes wrong - get a good lawyer who knows this area. Investigate the cause, including discussions with engineers.

A wave of IoT hack and vuln litigation is coming - you need to be thinking about this now. Understand and use sound cybersecurity design and engineering principles.

BH18: WebAssembly: A New World of Native Exploits on the Browser

Justin Engler, Technical Director, NCC Group
Tyler Lukasiewicz, Security Consultant, NCC Group

WASM (WebAssembly) allows you to take code written elsewhere and run it in a browser.

Crypto miners and archive.org alike are starting to use WebAssembly.

Browsix is a project to implement POSIX interfaces in the browser, and JSLinux has an entire OS in the browser. eWASM is a solution for Ethereum contracts (an alternative to Solidity). (And a bunch of other cool things.)

Remember when... Java Applets used to claim the same things (sandboxing, virtualization, code in browser)...

WebAssembly is a relatively small set of low-level instructions that are executed by browsers. It's a stack machine. You can push and pop things off the stack (to me the code looks a lot like lisp).  We do a couple of walkthroughs of sample code - they created a table of function pointers (egads! it's like networking kernel programming).

WASM in the browser - it can't do anything on its own (can't read memory, write to the screen, etc). If you want it to do anything, you need to import/export memory/functionality/etc. Memory can be shared across instances of WASM.

Emscripten will help you create .wasm binaries from other C/C++ code, includes built-in C libraries, etc. Can also connect you to Java and JavaScript.

Old exploits in C work in WASM, like format strings and integer overflows. WASM has its own integer types, different from C, different from JavaScript. You need to be careful sending integers across boundaries (overflow). Buffer overflows are an issue as well. If you try to go past your linear memory, you get a JS error - it doesn't work well, it's pretty ugly.
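The integer-boundary problem is easy to demonstrate even without a browser. The sketch below only simulates, in Python, what happens when a large number is squeezed into a 32-bit integer (similar in spirit to JavaScript's ToInt32 conversion); the helper names are mine and this is not real WASM.

```python
def to_i32(value):
    """Wrap an integer to signed 32 bits, keeping only the low 32 bits."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def to_u32(value):
    """Interpret the low 32 bits as an unsigned integer."""
    return value & 0xFFFFFFFF


requested_length = 2**32 + 8             # a "huge" length on the JavaScript side
print(to_i32(requested_length))          # what a 32-bit parameter would receive
print(to_i32(2**31))                     # large positive value turns negative
print(to_u32(-1))                        # -1 seen as an unsigned value
```

A length check done on one side of the boundary can therefore mean something quite different on the other side.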

You can now go from a BOF (Buffer Over Flow) to XSS. Emscripten's API allows devs to reference the DOM from C/C++. Character arrays being written to the DOM create the possibility of DOM-based XSS, and an attacker can use a user-tainted value to overwrite a safe value. This type of attack likely won't be caught by any standard XSS scanners. As JS has control of the WASM memory and tables, XSS should give us control of any running WASM.

And this even creates new exploits! We can now have a function pointer overflow. Emscripten has functions that run arbitrary code (emscripten_run_script). An attacker can take advantage of that as long as it's loaded. They discovered that function tables are constant - across compilations and even on different machines.
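Here is a toy model of the function pointer overflow idea, simulated in Python rather than real WASM. A byte array stands in for linear memory, a list stands in for the function table, and the layout (an 8-byte buffer directly followed by a 1-byte table index) is invented for the demonstration.

```python
def safe_handler():
    return "safe"


def run_script():
    # Stands in for a dangerous function that happens to be in the table.
    return "arbitrary script ran"


FUNCTION_TABLE = [safe_handler, run_script]   # indexes are predictable

BUF_SIZE = 8
memory = bytearray(BUF_SIZE + 1)   # 8-byte buffer followed by a 1-byte table index
memory[BUF_SIZE] = 0               # index 0 -> safe_handler


def unchecked_copy(data):
    """Like strcpy: copies everything, with no bounds check on the buffer."""
    memory[0:len(data)] = data


def call_indirect():
    """Call whatever function the stored index currently selects."""
    return FUNCTION_TABLE[memory[BUF_SIZE]]()


before = call_indirect()
unchecked_copy(b"A" * BUF_SIZE + bytes([1]))   # 9 bytes into an 8-byte buffer
after = call_indirect()
print(before, "->", after)
```

Because the table indexes do not change between builds or machines, the attacker does not need to guess an address, only a small integer.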

You don't necessarily need to go after the XSS here; you could use functions written by the developers, as long as they have the same signature as the real one.

They also showed a server-side RCE (Remote Code Execution). Showed code in the browser starting a process on the server.

Many mitigations from C/C++ won't work on WASM. They could use things like ASLR and could use some library hardening. Effective mitigations include control flow integrity and function definitions and indexing (prevents ROP-style gadgets).

WASM did cover these in their security warning, in a buried paragraph. It should be more obvious.

If you can, avoid emscripten_run_script and friends, run the optimizer (it removes automatically included functions that might have been useful for control flow attacks) and use control flow integrity (but it may be slower). And you still have to fix your C bugs!

There is a whitepaper out - Security Chasms of WASM

BH18: AI & ML in Cyber Security - Why Algorithms are Dangerous

Raffael Marty, VP Corporate Strategy, Forcepoint

We don't truly have AI, yet. Algorithms are getting smarter, but experts are more important. Understand your data and algorithms before you do anything with them. It's important to invest in experts that know security.

Raffael has been doing this (working in security) for a very long time, and then moved into big data. At Forcepoint, he's focusing on studying user behavior so that they can recognize when something bad is happening. ("The Human Point System")

Machine learning is an algorithmic way to describe data. In supervised case, we are giving the system a lot of training data. Unsupervised, we give the system an optimization for it to solve.  For "Deep Learning" - it is a newer machine learning algorithm. It eliminates the feature engineering step.  Data mining is a set of methods to explore data automatically.  And AI - "A program that doesn't simply classify or compute model parameters, but comes up with novel knowledge that a security analyst finds insightful" (not there, yet).

Computers are now better than people at playing chess and Go, they are even getting better at designing effective drugs and for making things like Siri smarter.

Machine learning is used in security for things like detecting malware, spam detection, and finding pockets of bad IP addresses on the Internet in supervised cases, and more in unsupervised.

There are several examples of AI failures in the field, like the Pentagon training AI to learn tanks (they used sunny pictures for "no tank" and cloudy with tanks, so the AI system assumed no tanks were in sunny weather... ooops!)

Algorithms make assumptions about the data, they assume the data is clean (often is not), make assumptions about distribution of data and don't deal with outliers.  The algorithms are too easy to use today - the process is more important than the algorithm.  Algorithms do not take domain knowledge into account.  Defining meaningful and representative distance functions, for example.  Ports look like integers and algorithms make bad assumptions here about "distance"

There is bias in the algorithms we are not aware of (example of translating "he is a nurse. she is a doctor" from English to Hungarian and back again... suddenly the genders are swapped! Now she is a nurse....)

Too often assumptions are made based on a single customer's data, or learning from an infected data set, or simply missing data.  Another example is an IDS that got confused by IKE traffic and classified it as a "UDP Bomb".

There are dangers with deep learning use. Do not use it if there is not enough (or no) quality labelled data, and look out for data-cleaning issues like time zones. You need to have well trained domain experts and data scientists to oversee the implementation, and to understand what was actually learned.
Note - there are not a lot of individuals that understand both security and data science, so make sure you build a good, strong and cohesive team.

You need to look out for adversarial input - you can add a small amount of noise to an image, for example, that a human cannot see, but can trick a computer into thinking a picture of a panda is really a gibbon.

Deep learning - is it the solution to everything? Most security problems cannot be solved with deep learning (or supervised methods in general). We looked at a network graph - we might have lots of data, but not enough information or context nor labels - the dataset is actually no good.

Can unsupervised learning save us? Can we exploit the inherent structure within the data to find anomalies and attacks? First we have to clean the data, engineer distance functions, analyze the data, etc...

In one graphic, a destination port was misclassified as a source port (80!), and one bit of data had port 70000!  While it's obvious to those of us with network knowledge that the data is messed up, it's not to the data scientists that looked at the data. (with this network data, the data scientists found "attacks" at port 0).

Data science might classify port 443 as an "outlier" because it's "far" from port 80 - but to those of us who know, they are not "far" from each other technically.
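A quick sketch of the distance problem with ports. The naive distance treats port numbers as ordinary integers; the second function is a hypothetical domain-aware alternative, with the grouping of web ports and the 0 / 0.5 / 1 scale chosen by me purely for illustration.

```python
WEB_PORTS = {80, 443, 8080}   # hypothetical grouping a network person might choose


def naive_distance(a, b):
    """Treat ports as plain numbers, as a generic algorithm would."""
    return abs(a - b)


def service_distance(a, b, groups=(WEB_PORTS,)):
    """0 if identical, 0.5 if both ports carry the same kind of traffic, else 1."""
    if a == b:
        return 0.0
    if any(a in g and b in g for g in groups):
        return 0.5
    return 1.0


def is_valid_port(p):
    """Ports are 16-bit values, so anything above 65535 is bad data."""
    return 0 <= p <= 65535


print(naive_distance(80, 443), naive_distance(80, 79))
print(service_distance(80, 443), service_distance(80, 79))
print(is_valid_port(70000))
```

Numerically, port 79 looks like port 80's nearest neighbor and 443 looks like an outlier; with even a little domain knowledge encoded, that relationship flips.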

Different algorithms struggle with clustered data, the shape of the data.  Even if you choose the "right" algorithm, you must understand the parameters

If you get all of those things right, then you still need to interpret the data. Are the clusters good or bad? What is anomalous?

There is another approach - probabilistic inference. Look at Bayesian Belief Networks. The first step is to build the graph, thinking about the objective and the observable behaviors. If the data is too complicated, you may need to introduce "grouping nodes" and introduce the dependencies between the groups. After all the right steps, you still need to get expert opinions.
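For a feel of how probabilistic inference combines expert opinion with observations, here is the simplest possible case: Bayes' rule applied to one hidden state ("host is compromised") and two observable behaviors. This is far smaller than a real belief network, it assumes the two observations are independent given the hidden state, and every probability below is a made-up number standing in for an expert's estimate.

```python
def posterior(prior, p_obs_given_true, p_obs_given_false):
    """Bayes' rule for a single binary observation."""
    evidence = p_obs_given_true * prior + p_obs_given_false * (1 - prior)
    return p_obs_given_true * prior / evidence


belief = 0.01                            # expert prior: host is compromised
belief = posterior(belief, 0.9, 0.05)    # observed: contact with a rare domain
after_first = belief
belief = posterior(belief, 0.7, 0.10)    # observed: unusual admin tool executed

print(round(after_first, 4), round(belief, 4))
```

Neither observation is damning on its own, but together they move a 1% prior to better than even odds, and an analyst can see exactly which estimate drove the result.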

Need to make sure you start by defining your use-cases, not by choosing an algorithm. ML is barely ever the solution to your problem. Use ensembles of algorithms and teach the algos to ask for input! You want them to have expert input and not make assumptions!

Remember - "History is not a predictor, but knowledge is"


BH18: Kernel Mode Threats and Practical Defenses

Joe Desimone, Gabriel Landau (Endgame)

Looking at kernel attacks, as they are a way to take over the entire machine and evade all security technology. Historically, Microsoft was vulnerable to malware - not prepared for those types of attacks, but they have made improvements over the years with things like PatchGuard and Driver Signature Enforcement. PatchGuard isn't perfect, attacks get through, but MS is constantly updating so the attacks don't work for long.

Both of these technologies are focused on 64-bit kernels, which is the growing norm today.

Attackers are now using bootkits, so Microsoft and Intel have come up with technology to counter them (Secure Boot, Trusted Boot, Intel Boot Guard, and Intel BIOS Guard).

All of those protections have changed the landscape. We don't see millions of kernel based botnets out there anymore.  But now people are signing their malware to look more legitimate and trick people to install.

DUQU 2.0 was a nation state attack; the main payload used a 0day in win32k.sys for kernel execution (CVE-2015-2360), and it was spoofing process information to route malicious traffic on the internal network.

The introduction of virtualization-based security has also made the system more secure against things like Uroburos, Duqu2 and DoublePulsar.

The MS kernel has evolved greatly over the last 10 years, with much-improved mitigations. But the problem is the adoption rate. There are still a lot of systems running Windows 7, which does not benefit from these new protections.

The speakers are on their org's red team, so they are always looking for new ways to attack the system. They want to avoid detection and signature checks - their blue team is on the lookout for user mode priv escalation, so they wanted to be in the kernel. Looked at sample code from Winsock Kernel, found it was very effective (no beacons).

Did find a good attack, which means they needed to improve their own security.

Modification of kernel memory can significantly compromise the integrity of the system, so this is a major area of concern.

Chip manufacturers need to ship hardware with ROP detection enabled, otherwise this will always be a vector of attack. They did this by creating a surrogate thread, putting it to sleep, finding the location of its stack and taking advantage of it (more details in the deck, the slides move pretty fast), but the interesting thing here is how much they can do by reusing existing code.

To protect yourself, you should very carefully monitor driver load events. Look for low prevalence drivers and known-exploited drivers. You need hypervisor protection policies, using white lists (which are hard to maintain), and to limit kernel drivers to WHQL-signed ones. They have made a new tool to help reduce the attack surface, available on their website today.

They wrote some code to generically detect function pointer hooks, locate the function pointers by walking relocation tables and leverage Endgame Marta. They consider it a hit if the pointer originally pointed to a +X section in the on-disk copy of the driver, does not point to a loaded driver in memory, and points to executable memory.
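The three-part "hit" rule reads naturally as a predicate. This is only a sketch of the decision logic as I noted it; the record type and field names are hypothetical, and actually gathering these facts (walking relocation tables, checking page protections) is the real work in their tool.

```python
from dataclasses import dataclass


@dataclass
class PointerInfo:
    """Hypothetical record describing one function pointer found in a driver."""
    on_disk_target_executable: bool   # original target was in a +X section on disk
    target_in_loaded_driver: bool     # current target lies inside a loaded driver
    target_memory_executable: bool    # current target memory is executable


def is_suspicious_hook(p):
    """All three conditions from the talk must hold for a hit."""
    return (p.on_disk_target_executable
            and not p.target_in_loaded_driver
            and p.target_memory_executable)


clean = PointerInfo(True, True, True)      # still points into a loaded driver
hooked = PointerInfo(True, False, True)    # redirected to stray executable memory
data_ptr = PointerInfo(False, False, True) # was never a code pointer on disk

print(is_suspicious_hook(clean), is_suspicious_hook(hooked), is_suspicious_hook(data_ptr))
```

Requiring all three conditions keeps false positives down: pointers that were never code pointers, or that still land inside a legitimate driver, are ignored.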
 
ROP generates a lot of mispredictions, so need to protect this area as well (they could attack by scanning drivers to identify call/return sites, configure LBR to record CPL0 near returns, etc).

The talk had lots of cool demos - can't really capture it here.

Windows platform security has gotten much better, but there are still kernel threats. You need to be using at least Windows 10 with SecureBoot and HVCI at a minimum to protect yourself. Require EV/WHQL within your organization.

Wednesday, August 8, 2018

BH18: Don't @ Me: Hunting Twitter Bots at Scale

Jordan Wright, Olabode Anise, Duo Labs

Social media is a great way to have genuine conversations online, but the sphere is getting filled with bots, spam and attackers.

Not all bots on Twitter are malicious - they could be giving us automated data on earthquakes, git updates, etc. So, their research was focused on finding bots and then figuring out if they were malicious.

The goal here is to build a classifier, one that could learn and adapt.

They wanted their research to be reproducible, so used the official Twitter APIs - though by doing so, they were rate limited. Because they were rate limited, they needed to be as efficient as possible. Fitting into that model, they were able to do 8.6 million lookups per day.

Twitter's account ids started as sequential 32-bit unsigned integers, but the researchers started with random 5% sampling. The dataset has gaps - closed accounts, etc. Noticed accounts went up to very large numbers, and those accounts were up to 2016. But, Twitter changed to using "Snowflake IDs" - generated by workers, same format as other Twitter ids (tweets, etc).

The Snowflake ID is 63-bit, but starts with a timestamp (41 bits), then worker number (10 bits), then sequence (12 bits). It is very hard to guess these numbers. So, they used the streaming API with a random sample of public statuses (contains the full user object).
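
To make the bit layout concrete, here is a small sketch that packs and unpacks an ID using the 41/10/12 split described above. The function names are my own, and the epoch constant is the one from Twitter's open-source Snowflake project; treat it as an assumption if you are working with another service's IDs.

```python
from datetime import datetime, timezone

# Assumption: custom epoch (milliseconds) from Twitter's open-source Snowflake.
TWITTER_EPOCH_MS = 1288834974657


def encode_snowflake(timestamp_ms: int, worker: int, sequence: int) -> int:
    """Pack timestamp (41 bits), worker (10 bits) and sequence (12 bits)."""
    return ((timestamp_ms - TWITTER_EPOCH_MS) << 22) | (worker << 12) | sequence


def decode_snowflake(snowflake_id: int) -> dict:
    """Split an ID back into its three fields."""
    timestamp_ms = (snowflake_id >> 22) + TWITTER_EPOCH_MS
    return {
        "timestamp_ms": timestamp_ms,
        "created_at": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "worker": (snowflake_id >> 12) & 0x3FF,   # 10 bits
        "sequence": snowflake_id & 0xFFF,         # 12 bits
    }
```

Because most of the ID is a millisecond timestamp plus worker and sequence numbers, valid IDs are sparse in the 63-bit space, which is why random guessing does not work the way it did for sequential IDs.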

Now - they have a giant dataset :-)

Looked at last 200 tweets, accounts with more than 10 tweets, declared English and then they fetched the original tweets.  This data was too hard to get - could only do 1400 requests/day.

They took the approach of starting from known bots and discovering the bot nets they were attached to.

The data they have include attributes (how many tweets, are they followed, in lists, etc), looking at tweet content (lots of links?), and frequency of tweets.

They examined the entropy of the user name, was it fairly random? Probably a bot. Same for lots of numbers at the beginning or end. Watching for ratios of followers to following and the number of tweets.
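
As a rough illustration of the user name features, here is a sketch that computes character-level Shannon entropy and counts digits at the start and end of a name. The function names are hypothetical, and the speakers did not share their exact feature definitions in the part I captured, so this is just one plausible way to do it.

```python
import math
from collections import Counter
from typing import Tuple

DIGITS = "0123456789"


def shannon_entropy(name: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    if not name:
        return 0.0
    total = len(name)
    counts = Counter(name)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def digit_affix_lengths(name: str) -> Tuple[int, int]:
    """Number of digits at the beginning and at the end of the name."""
    leading = len(name) - len(name.lstrip(DIGITS))
    trailing = len(name) - len(name.rstrip(DIGITS))
    return leading, trailing
```

A name made of one repeated character scores 0 bits, and a name where every character is different scores the maximum for its length. Any cut-off for "probably a bot" would have to come from the training data.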

They applied heuristics to the content - like number of hashtags in tweets, number of URLs (could be a bot or a news agency!), number of users @ replied.  On behavior - look at how long it takes to reply or retweet, and the unique set of users retweeted.  Genuine users would go quiet for periods (like when sleeping).

Then we got a Data Science 101 primer :-)

This is where it gets complicated and statistics come into play, and the reminder that your model is only as good as your data. For example, if they trained with the crypto currency bots, they found 80% of the other spam bots. When reversed, they only caught about 50% of the crypto currency bots.


Crypto currency giveaway accounts are very problematic - they look legitimate and they will take your "deposit" and then you will lose your money.  They were hard to find, until they realized that there are accounts out there that have many bots following them. Find those legitimate accounts, then you can find the bots.... They also followed like behaviors, used to map relationships.  They found mesh and hub/spoke networks, but they were connected with likes.

They also discovered verified accounts that had been taken over, then modified to look like a more active account (like Elon Musk), which adds legitimacy to the crypto currency spam.

Very interesting research!



BH18: There Will Be Glitches: Extracting and Analyzing Automotive Firmware Efficiently

Alyssa Milburn & Niek Timmers, Riscure.

The standard approach for breaking into embedded systems: Understand target, Identify vulnerability, exploit vulnerability. Note - he also is referring to ECUs found in cars.

To understand the embedded system, need to understand the firmware. To do so - you need to get a hold of a car! Good source for cheap cars with common components - recalled Volkswagens :-)

Today's talk is targeting the instrument cluster - why? Because it has visual indicators you can see what is happening - it has blinking lights! :-)

Inside the instrument panel you will find the microcontroller, the EEPROM, display and the UART for debugging (but, it's been secured).  So, we have just inputs and outputs we don't understand. After much analysis, discovered most instrument panels talk UDS (ISO 14229) over the CAN bus. This covers diagnostics, data transmission (read/write), security access check and loads more!

The team identified the read/write memory functions, but also discovered they were well protected.

Discovered that there are voltage boundaries, and if they go out of bounds they can stop the MCU. But... what if we do it for a very short amount of time? Will the chip keep running?

Had to get fault injection tooling - ChipWhisperer or Inspector FI - all available to the masses.

Fault injectors are great for breaking things. Once a glitch is introduced, nothing can be trusted. You can even change the executed instructions - opens a lot more doors! If you can modify instructions, you can also skip instructions!

They investigated adding a glitch to the security access check. Part of the check has a challenge, and if the expected response is received - access is granted. The team tried adding a glitch here, but were not successful, due to a 10 minute timeout after 3 failed attempts. As they are looking for something easy... moved on!

So, they moved on to glitching the ReadMemoryByAddress - no timeout here! They were successful on several different ECUs, which are designed around different MCUs.  Depending on the target, they could read N bytes from an arbitrary address. It took a few days, but they were able to get the complete firmware.

There are parameters you can tweak for this glitch - delay, duration and voltage. Lots of pretty graphs followed.

It's hard to just do static analysis, as there is often generated code.

So, they wrote an emulator - allowed them to hook into a real CAN network, add debug stop points, and track execution more closely.

By using taint tracking, were able to find the CalculateKey function with the emulator.

There are new tools coming for electromagnetic fault injection - expensive right now, but getting cheaper.

ECU hardware still needs to be hardened - things like memory integrity and processing integrity. Unfortunately, these are currently being only designed for safety (not security).

There should be redundancy and the designers should be more paranoid. ECUs should not expose keys - need to leverage HSMs (hardened cryptographic engine). Highly recommend using asymmetric crypto - so the ECU only has a public key.

Do better :-)



BH18: Blockchain Autopsies - Analyzing Ethereum Smart Contract Deaths

Jay Little, Principal Security Engineer, Trail of Bits
Trail of Bits is a cyber security research company - high end security research and assessments.

Earlier this year he was working on a project with a friend to look into aspects of contracts.

Ethereum, EVM and Solidity

He asked for a show of hands about who has bought Ethereum here, lots of hands went up.

Ethereum is a blockchain based distributed ledger, called a "world computer" and has "smart" contracts. It is the 2nd largest crypto currency.

The Ethereum Virtual Machine (EVM) is a big endian stack machine with 185 opcodes, native data width is 256 bits, with many similar instructions. Each instruction has a 'gas cost' to prevent infinite loops.

Most contracts start at 0, there are 5 address spaces. Most people don't write their contracts in EVM, but use Solidity instead - it's a JavaScript inspired high level language for smart contracts. It has evolved (as opposed to being designed).

Much of the presentation is done with emojis - easier to see than a string of numbers :-)

Because contracts start at zero, he has seen undefined behaviors when counters get decremented too low.  Also issues with uninitialized variables - used in one case to backdoor a lottery system.
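
To show why decrementing a counter that sits at zero is so dangerous, here is a small Python model of 256-bit unsigned subtraction. EVM arithmetic wraps modulo 2^256, so going one below zero yields the largest possible value. The function names are mine, and `checked_sub` is just one way a contract author might guard against this, not something shown in the talk.

```python
UINT256_MODULUS = 2 ** 256  # EVM words are 256 bits wide


def evm_sub(a: int, b: int) -> int:
    """Unsigned 256-bit subtraction that wraps around, like the EVM SUB opcode."""
    return (a - b) % UINT256_MODULUS


def checked_sub(a: int, b: int) -> int:
    """Subtraction that refuses to go below zero instead of wrapping."""
    if b > a:
        raise ArithmeticError("underflow: result would be negative")
    return a - b


balance = 0
wrapped = evm_sub(balance, 1)  # a counter at zero decremented once more
print(hex(wrapped))            # 64 hex digits, all 'f'
```

A balance or counter that wraps this way suddenly looks enormous, which is exactly the kind of surprise an attacker can use.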

There is a new tool, Rattle, that recovers EVM control flow.  Other tools, Geth and Parity, run on public nodes. This was followed by a walkthrough of using the tools and their CLI options and looking at some contracts.  He shared the code for finding contracts as well.  Geth and Parity have a lot of issues, so he's been looking at etherscan.io - a quick lookup database.

Doing a hybrid approach of using Geth and Parity to find the contracts over a few hours, then look into etherscan.io.  Looking at 6M blocks, about half are duplicates. Some are empty, but have a balance - which shouldn't happen.

Sometimes the contracts fail, because they did not use enough 'gas' . Found a contract with no code (unusable) but with about $7000 in it - stuck there forever.  All told, there is about $2.6M stuck in empty contracts that can never be retrieved.

Some duplicates have infinite loops - could be intended as a network DoS. Others seen with noise or spam, or NUL value issues

From tracing they were able to look into contracts where the account triggering the self destruct was not the original creator - they tend to send the money to address 0, losing it forever.

If you are developing contracts, make sure you understand and fix all warnings. Add an Echidna test and write extensive positive and negative tests. Most importantly, perform a rigorous assessment!






Tuesday, June 5, 2018

Learning Ally: Books I've Narrated

Working with Learning Ally, I record textbooks and novels for the blind and dyslexic, along with others that learn differently.

I've been keeping this list on LinkedIn, but hit the LinkedIn character maximum. I didn't always keep track, so there may be a few more books. I started volunteering at Learning Ally in Palo Alto in August 2012, followed them to Menlo Park and am preparing to start volunteering from home.

When I started, we had physical books we read from and we've since moved to VoiceText (scanned texts) and PDF books. This makes it easier to start recording at home!

Here are the books that I've narrated over the years. I'll continue to add to this post as I complete more books.  The hours listed are total length of the finished narration. It takes usually 3 times as long recording and correcting to get that finished product.

Recorded in 2018
Recorded in 2017
  • Tales from a Not-So-Friendly Frenemy (Dork Diaries #11) (Rachel Renee Russell) (248 pages, 2.17 hours)
  • Tales from a Not-So-Fabulous Life (Dork Diaries #1) (Rachel Renee Russell) (282 pages, 3.08 hours)
  • The San Francisco Earthquake (I Survived #5) (Lauren Tarshis) (98 pages, 1.27 hours)
  • Shadows of Sherwood (Robyn Hoodlum #1) (Kekla Magoon) (356 pages, 7.48 hours)
  • Mythology (Edith Hamilton) (475 pages, 11.35 hours)
  • Carve the Mark (Veronica Roth) (467 pages, 12.14 hours)
  • Goosebumps Book 8: The Girl Who Cried Monster (138 pages, 2.50 hours)
  • Goosebumps Book 3: Monster Blood (R. L. Stine)
Recorded in 2016
  • Ink and Bone (Rachel Caine) (354 pages, 10.97 hours)
  • Dragons of Winter (James A. Owen) (389 pages, 9.38 hours)
  • Tru & Nelle (G. Neri) (328 pages, 4.70 hours)
  • City of Ice (Ken Yep) (362 pages, 8:47 hours)
  • Winter: The Lunar Chronicles (Marissa Meyer) (828 Pages, 20 hours)
Recorded in 2015
  • If You Could Be Mine (Sara Farizan) (248 Pages, 4:59 hours)
  • The Vanishing Game (Kate Kae Myers) (356 pages, 7:45 hours)
  • A Northern Light (Jennifer Donnelly) (396 pages, 8:57 hours)
  • Liar Temptress Soldier Spy: Four Women Undercover in the Civil War (Karen Abbott) (513 pages, 12:20 hours)
  • The Spiritglass Charade (Colleen Gleason) (360 pages)
  • Wicked Girls (Stephanie Hemphill) (389 pages)
Recorded in 2014
  • The Wicked and the Just (J. Anderson Coats) (342 pages, 7:30 hours)
  • The Spy Catchers of Maple Hill (311 pages)
  • California Driver Manual (106 pages, 4:15 hours) (Yes, DRIVER, not Driver's ... )
  • Unbroken: A Ruined Novel (Paula Morris) (295 pages)
  • Froi of the Exiles (Melina Marchetta) (598 pages, 16:53 hours)
  • The Amazing Monty  (Johanna Hurwitz)
Recorded in 2013
  • Every Other Day (Jennifer Lynn Barnes)
  • The Last Dragonslayer (Jasper Fforde)
  • The Red Convertible
  • Michael's Mystery
  • Inkheart



Friday, May 11, 2018

ICMC18: Update from the "Security Policy" Working Group

Update from the “Security Policy” Working Group (U32a) Ryan Thomas, Acumen Security, United States

This was a large group effort, lots of people coming together. The security policy is used by the product vendor, CST laboratory (to validate), the CMVP, and user and auditor (was it configured in the approved fashion?)

The working group started in 2016, to set up a template with the big goals of efficiency and consistency. When they started, were focused on tables of allowed and disallowed algorithms (non-approved), creation of a keys and CSPs table, approved and non-approved services and mapping the difference.

But, the tables were getting unwieldy, and we were told there were changes coming. Then folks started getting ideas on adding it to the automation work. So, the group took a break after helping with the update of IG 9.5.

Fast forwarding to today, many modules leverage previously validated libraries (OpenSSL, Bouncy Castle, NSS) so the documents should be very similar... but not always. Often still very inconsistent. New goal is to target 80% of the validation types, not all.

Creating templates and example security policies will have people coming from a common baseline. This will be less work for everyone, and hopefully get us to the certificate faster!

By Summer 2018, hope to have a level 1 and level 2 security policy. This is not a new requirement, but a guideline. It will point you to the relevant IGs / section and just help you streamline your work and CMVP's work.

Need to harmonize the current tables to the template. It will be distributed on the CMUF website when complete.

While doing this work, discovered a few things were not well understood across the group (like requirements for listing the low level CPU versions).

Got feedback from CMVP about what are the most common comments they send back on security policy - like how does the module meet the IG A.5 shall statements? How does XTS-AES meet IG A.9? Etc.

ICMC18: Keys, Hollywood and History: The Truth About ICANN and the DNSSEC Root Key

Keys, Hollywood, and History: The Truth About ICANN and the DNSSEC Root Key (U31c) Richard Lamb, Self-Employed, United States

Started with a segment about the Internet phone book from a television show.  Richard notes they got a lot of things right, but instead of breaking up the code into 7 cards - there are indeed 7 smart cards and 7 people all over the world that help ICANN.

Did a quick demonstration of how DNS works in the room, and learned about how important it truly is. Dan Kaminsky's DNS exploit at DefCon 2008 at least drew attention to how important DNS is.

The other source of trust on the Internet is CA Certificate Roots, and he encouraged all web traffic to be encrypted.

Four times a year, people really do get together to do a public key ceremony. You can come watch if you want - just like they said on TV!  There are at least 12 people involved in the key ceremony, due to the thresholding schemes used by HSM vendors. The members must be from all over the world, cannot be all (or even mostly) Americans. They are Trusted Community Representatives (TCRs).
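
For anyone unfamiliar with threshold (m-of-n) schemes, here is a toy sketch using Shamir's secret sharing. The talk did not say which scheme the HSM vendors actually use, so Shamir is my stand-in as the textbook example, and the names and the field prime are my own choices. Do not use this for real keys.

```python
import secrets
from typing import List, Tuple

PRIME = 2 ** 127 - 1  # a Mersenne prime, big enough for a demo secret


def split_secret(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Hide the secret as the constant term of a random polynomial."""
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 using any `threshold` shares."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        inverse = pow(denominator, PRIME - 2, PRIME)  # modular inverse (Fermat)
        secret = (secret + yi * numerator * inverse) % PRIME
    return secret
```

With a threshold of 3 out of 7, any three card holders can rebuild the value, while fewer than three learn nothing about it. That is the property that lets a ceremony go ahead even if some representatives cannot travel.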

The Smart Cards are stored in a credential safe. The HSM is in a separate safe, there are iris scans available.  It is all live recorded and in a secure room. Process is certified. Shielded spaces, protected tamper evident bags (changed bags after someone was able to get into the bag w/out evidence).

The presentation moved very fast and lots of interesting things in there - can't wait to get access to the slides.





ICMC18: Panel Discussion: The Future of HSMs and New Technology for Hardware Based Security

Panel Discussion: The Future of HSMs and New Technology for Hardware Based Security Solutions (A31a) Tony Cox, Cryptsoft, Australia; Thorsten Groetker CTO, Utimaco; Tim Hudson, Cryptsoft, Australia; Todd Moore, Gemalto, United States; Robert Burns, Thales, United States

All of the panelists have a strong background in cryptography and HSMs. Starting out by defining HSMs, a secure container for secure information. Needs extra protection, may have acceleration, may be rack mounted, smart card, USB, PCMCIA card, appliance, etc. - maybe even software based.

Or, is it a meaningless term? It could be virtual, it could be a phone, it could be in the cloud - anything you feel is better than things that aren't HSMs, Tim posited.

Thorsten disagrees - there has to be a wall and have only one door and strong authentication.

Bob noted that overloading the term does cause confusion, but should not dilute what are good hardware based HSMs.

Tim notes that people buy their HSMs by their brand, not always for their features and a deep evaluation of the underlying project. Thorsten agrees that may happen in some cases, or they may be looking for a particular protection profile or FIPS 140 level.  Bob notes that branding and loyalty play a part, but does think people look at features. Tim said he's been in customer conversations where people are influenced by the color of the lights or box.

Bob mentioned that it's not easy to install an HSM, so you're only doing it because you need it or are required to have it.

The entire panel seems to agree (minus some tongue in cheek humor) that easier to configure is important, more likely to be installed correctly.  But, still a way to go - customers are always asking for how to do this faster and more easily.  This may be leading to more cloud based HSMs.

Bob - There are trade offs - we can't tell them what their risk profile is and what configuration is right for them.

On security, Thorsten notes that some customers may be required to use older algorithms, and he recommends doing risk assessments. Just because you are writing a compliant (say, PKCS#11) application does not mean it is secure.  Having standards based makes migrating and interoperability a lot easier, but it does not always meet all of your business needs. This is why most HSM vendors make their own SDK as well.

Bob agrees, that a universal API means punting on some tough problems. As a vendor you can choose being fully compliant, or locking in your customer to your API.  PKCS#11 is great, but it is a C API - where is the future going? Need more language choices.

Tony asks - given the leadership of PKCS#11 team in the room, what could we do better? Tim makes a comment on KMIP, Bob agrees it's important but still not fully portable. Bob thinks there's an opportunity to look at the problem in a different way for PKCS#11 - the implementation is locked into C, which is no longer on the growth curve for our customers. People want the more managed languages, so people are creating shims over PKCS#11.

Thorsten likes the aggregate commands in KMIP, but still not perfect.

Todd noted we need RESTful based APIs, and there are gaps in what the standards are offering.

Tim notes that he doesn't think the vendors are always clear with their customers that they are going down a path of getting locked in. Bob disagrees that vendors are doing this on purpose.

Valerie couldn't help but note that standards are only as good as the people that contribute to them, and if the vendors are finding gaps in PKCS#11 or KMIP, please bring those gaps to the committees and join them and help to improve them.

Tim notes that there are more hardware protections available to software developers (ARM TrustZone and Intel's SGX). Bob notes that they are interesting technologies, but not a true HSM, not as strong of a container. Additionally, key progeny and ownership is an issue as those keys are owned by specific companies. It would be good to expand this, particularly in the cloud space.

Thorsten believes the jury is still out - interesting approaches, but not quite there for putting the level of trust you would put into a level 3 HSM. If a US / American vendor has a kill switch that could stop your whole system from running, it's much less appealing for those of us outside of the US.  Worry about what other things could be exposed in that way - it's like a good new cipher; need to look at it and how it is implemented.

Todd notes these technologies are starting to get very interesting because they can go into edge devices and cloud services. We are excited to see how these are going to grow. Vendors still need to provide key life cycle guidance and standards compliance, and make sure CIA are in place.

Thorsten notes it is a good building block of an embedded HSM, but he'd still be nervous about sharing the CPU.

Tim says it sounds like it's better than software alone, but not up to these vendors' HSMs. Bob remembers the time that HSMs used to be needed to get any decent performance, and they are already very different just 15 years later and expects another incarnation in 15 years.

Todd notes that Google just launched a silo that would leverage these technologies and managed SDKs. Bob agrees that middleware can benefit from technologies like SGX. Tim notes standards are still very important, and wants users to communicate this to vendors.

Lots more of excellent conversation.



ICMC18: TLS Panel Discussion

TLS Panel Discussion (S30b) Moderator: Tim Hudson, CTO and Technical Director, Cryptsoft Pty, Australia; Panelists: Brent Cook, OpenBSD, United States; David Hook, Director/Consultant, Crypto Workshop, Australia; Rich Salz, Senior Architect, Akamai Technologies & Member, OpenSSL Dev Team, United States

There are quite a few TLS implementations out there, in a variety of languages. David thinks this is generally a good thing, gets more people looking at the specification and working out the ambiguities. Brent agrees, it gets more people looking at it, lowers the chance of one security issue impacting all implementations. Rich noted that in the past that the way OpenSSL did it was the "Right Way" and people would write their code to interoperate with them, as opposed to against the specification, but he thinks it's better to have more as they fit different areas (like IoT).

There are a lot of implementations out there using the same crypto implementations, ASN.1 or X.509. That can be good, like the Russian gentleman who writes low level assembly to accelerate the algorithms - so everyone can be fast, but it's still good to see alternative implementations.

All of the panelists hear from their customers, getting interesting questions.  They generally have to be careful about turning things off, because you never know who is using an option or for what.

Bob Relyea noted users should be cautioned if they think they should write their own TLS library, when there are several very good ones out there. Forking is not always the answer, because it reduces the number of people looking at each implementation.  Let's make sure the ones we really care about have the right folks looking at them.

Brent notes that they (OpenBSD) are more focused on TLS for an operating environment, and they are glad they forked. If OpenSSL hadn't wrapped their memory management like they did, folks and tools at OpenBSD would've found Heartbleed sooner.

Rich discussed the debate IETF had been having with financial institutions who wanted a way to observe traffic in the clear. The IETF did not want this and said no, they are a paranoid bunch. This means some companies won't be able to do with TLS 1.3 what they may have been able to do before. Encrypted will be encrypted.

Brent makes some deep debugging and injection tools, and also agrees, you don't want there to be an easy way to decrypt banking traffic.

Lots of great questions with quick answers that were hard to capture here, but a very enjoyable presentation.

ICMC18: TLS 1.3 and NSS

TLS 1.3 and NSS (S30a) Robert Relyea, Red Hat, United States

PKCS#11 is the FIPS boundary for NSS. AES/GCM presents a difficulty in the PKCS#11 v2.X API, but will be addressed in v3.X. While in FIPS mode, keys are locked to the token and cannot be removed in the clear. This means their SSL implementation doesn't have actual access to the keys - so MACing, etc, needs to happen within NSS's softtoken.

In NSS's FIPS mode, only allowed FIPS algorithms were on. This caused problems for accessing things like ChaCha, so now they are only locked in the security policy.

The TLS 1.3 engine in NSS is very different than 1.2. We rewrote the handshake handling state machine. We have finally dropped support for SSL 2.0 altogether and have notified customers that SSL 3.0 is next (currently turned off). TLS 1.3 uses a different KDF as well; NSS already had support for HKDF through a private PKCS#11 NSS mechanism.  Essentially, everything (but the record format) has changed.
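
For readers who have not looked at HKDF before, here is a minimal sketch of its two steps (extract, then expand) as defined in RFC 5869, using only the Python standard library. This is plain HKDF, not the labeled key schedule TLS 1.3 builds on top of it, and it has nothing to do with how NSS exposes HKDF through PKCS#11. The function names are my own.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, input_key_material: bytes, hash_name: str = "sha256") -> bytes:
    """Step 1: concentrate the input keying material into a pseudorandom key."""
    if not salt:
        # RFC 5869: a missing salt is treated as a string of zero bytes.
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, input_key_material, hash_name).digest()


def hkdf_expand(pseudorandom_key: bytes, info: bytes, length: int,
                hash_name: str = "sha256") -> bytes:
    """Step 2: stretch the pseudorandom key into `length` bytes of output."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        # Each block chains the previous block, the context info and a counter.
        block = hmac.new(pseudorandom_key, block + info + bytes([counter]), hash_name).digest()
        output += block
        counter += 1
    return output[:length]
```

Different `info` strings give independent keys from the same secret, which is the property that lets one handshake secret feed many traffic keys.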

The implementation was done by Mozilla, primarily by Eric Rescorla and Martin Thompson. They had to rewrite the state machine. We wanted customers to start playing with the software, but due to the way it's configured, they sometimes got it by accident (by applications choosing the highest available version of TLS).

When will you see this? It's fairly complete in the NSS upstream code, but nobody has released it, yet. Draft 28 of TLS 1.3 was posted on March 30, 2018. We doubt there will be any further technical changes. The current PKCS#11 is sufficient, other than the KDF. The PKCS#11 v3.0 spec should be out by the end of 2018. Still gathering final proposals and review comments into the draft. HKDF missed the cutoff... Bob will work on taking HKDF through the PKCS#11 process as the 3.0 review moves forward, to hit the next version of the specification.

How do you influence NSS? The more you contribute, the bigger say you can get to influence the direction.

Thursday, May 10, 2018

ICMC18: KMIP 2.0 vs Crypto in a Cybersecurity Context

KMIP 2.0 vs Crypto in a Cybersecurity Context (G23c) Tony Cox, Cryptsoft, Australia; Chuck White, Fornetix, United States

They are both co-editors of the KMIP v 2.0 version of the specification. They are big fans of standards.

Wrapped up KMIP v1.4 in March 2017 (published in November 2017) and scoped KMIP 2.0. In January 2018, KMIP 2.0 working draft is out and had another face to face in April 2018.

Restructured the documents, removed legacy 1.x artifacts. Problems we create today will impact us for the next 10-15 years, so working with that in mind. Want a way to be able to make changes easily as needed in the future. The focus is on data in motion. We want to lower barrier for adoption, and make KMIP more accessible as a service with flow control and signaling for transaction of Encryption Keys and Cryptographic Operations.

Passwords are now sent as hashed passwords, including a nonce to prevent replay. We have an effective double hash, so no longer need to store passwords in the clear. We've had the concept of OTP (One Time Password) for a long time, but it's better defined to make it easier to use and interoperate.
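
To illustrate why a nonce stops replay, here is a generic challenge-response sketch. It is not the KMIP 2.0 wire format or its actual hashing construction, which I did not capture; every name here is hypothetical. The server keeps only a digest of the password, hands out a one-time nonce, and the client answers with a keyed hash over that nonce.

```python
import hashlib
import hmac
import secrets


def password_digest(password: str) -> bytes:
    """First hash: what the server stores instead of the clear password."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def client_response(password: str, nonce: bytes) -> bytes:
    """Second hash: binds the stored digest to this particular nonce."""
    return hmac.new(password_digest(password), nonce, "sha256").digest()


class Server:
    def __init__(self, stored_digest: bytes) -> None:
        self.stored_digest = stored_digest
        self.outstanding_nonces = set()

    def issue_nonce(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self.outstanding_nonces.add(nonce)
        return nonce

    def verify(self, nonce: bytes, response: bytes) -> bool:
        if nonce not in self.outstanding_nonces:
            return False  # unknown or already used nonce: treat as a replay
        self.outstanding_nonces.discard(nonce)
        expected = hmac.new(self.stored_digest, nonce, "sha256").digest()
        return hmac.compare_digest(expected, response)
```

A captured response is useless later because the server never accepts the same nonce twice. In this toy version the stored digest is as sensitive as the password itself, and a production design would also use a salted, slow password hash.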

We've also addressed login and delegated rights. Login is a simple mechanism to reduce authentications. Leveraging tickets to improve performance. Allows for delegation and supports 2FA. This will broaden KMIP applicability.

Flow control allows server initiated commands to clients with the client initiated connections. Allows the server to be a trust node managing encryption keys on other devices that are inside or outside a system perimeter.  Best of all - does not break the existing method of establishing a KMIP session (does not break clients).

Multiple ID Placeholders allow for simpler execution of compound key management operations. This provides a path to combine traditional key management operations and HSM operations into a single KMIP operation. Addressing broader concern of IoT and Cloud. Also added some changes to make dealing with offline devices easier.

Digest values usable between client and server - deterministic. Clients can rely on servers! Addresses a major source of non-interoperability.

There is now a concept of re-encrypt. What happens when you have an object that's been encrypted, and you want to rotate the keys? Don't want to expose keys while doing transitions. This new method allows the keys to stay in the server (w/in the FIPS boundary). Enabling rekeying more often. This is future proofing for post quantum crypto when we know people will need to rekey.

There are some default crypto parameters, allows parameter agnostic clients. Cryptography can change on the server side and not change the method in which it is requested by the client. Server can provide defaults if the client does not.

All of these features were to improve crypto agility and resilience, easier to use, and allows KMIP to be more impactful for Data in Motion, IoT, Distributed Compute and Cloud.

Implementation work is starting next, along with final reviews and hopefully close out any final issues in the next few months. Hope to publish in 2018!

This work is the culmination of our efforts and learnings in the key management space over the last 9 years as a standards body. But, if you have requirements the standard is not handling, come and join us or let us know what the issues are.


ICMC18: OpenSSL FIPS Module Validation Project: An Update

OpenSSL FIPS Module Validation Project (S23a) Tim Hudson, CTO and Technical Director, Cryptsoft Pty, Australia; Ashit Vora, Acumen Security, United States

Tim was part of OpenSSL before it was called OpenSSL. He's co-founder and CTO at Cryptsoft and a member of the OpenSSL Management Committee (OMC). Co-editor and contributor across OASIS KMIP and PKCS#11 technical committees.

Ashit is the co-founder and lab director at Acumen Security, acquired by Intertek Group last December. He has 15 years of certification and security experience. He is providing advice to OMC on their FIPS validations.

OpenSSL has been through validation 8 times, 3 certificates are still valid. Note the version number of validated modules is not a direct correlation to the actual OpenSSL version number. None of these modules work with OpenSSL v1.1. Cannot update the current modules, as they don't meet new IGs nor power on self test requirements. If you want to do your own revalidation, you have to fix those.

Haven't been able to do FIPS 140, yet, due to being so busy with all of the other features and work that needed to be done first (TLS 1.3, and all of the things in the previous discussion). Needed a good stable base to move forward for TLS 1.3.  The fact that IETF is still finishing TLS 1.3 gave them lots of time to polish the release and add lots of great features and algorithms.

The team is finishing off adding support for TLS v1.3, but it's taking longer than expected. This has delayed the kick off of the FIPS validation effort. However, commitment to FIPS validation is unwavering. It is important to the committee. We are doing it in a way that we feel is long term supportable, and not an add on.

The idea is to keep the FIPS 140 code in a usable state for future updates. There will be a limited number of operational environments (OEs) tested (previously there were over 140! Doing that many is expensive and time consuming). Will also be validating only a limited number of algorithms.  They plan to do source distribution as per previous validations.

At this point in time, there are no plans to add additional platforms to the base validation; the goal is to get it right and make it usable. As new platforms need to be added, other parties can do it independently.  Want to get this solution in place as soon as possible.

FIPS algorithm selection is planned to be functionally equivalent to the previous FIPS module. There won't be a separate canister anymore. It will be a shared library (or DLL) as the module boundary. Static linking will not be supported. It will be an external module loaded at runtime. It will look more like other validations, which should help overall. 

The interface will NOT be an OpenSSL Engine module. Don't want to be constrained by the engine limitations - it will be something else (TBA).

Have already fixed the entropy gathering and RNG.

This won't be a funny standalone release, it will be aligned with a future standard OpenSSL release (once solidified, we will tell you - it will be after the OpenSSL 1.1.0 release).

Will move to FIPS 186-4 key generation and NIST SP 800-56A, add SHA-3, and build in more efficient POSTs (power-on self-tests).

Current sponsors: Akamai, NetApp and Oracle. They are contributing to making OpenSSL FIPS a reality. Any other sponsors interested? You need to contact the OMC within the next 90 days; you can contact the alias or an individual on the OMC to start.

Next steps: Finalize planning! What functionality is in and what's out? Which platforms?  Then we have to begin the development process. We expect to publish design documents and have public pull requests for review.  We will be doing incremental development and we aim to minimize the impact on developers. We want early feedback from experienced FIPS 140 people and OpenSSL developers.

Adding more sponsors should help speed up the process, as long as they understand this will be an open process and they are willing to work within those constraints.


ICMC18: OpenSSL Project Overview

OpenSSL Project Overview (S22c) Rich Salz, Senior Architect Akamai Technologies & Member, OpenSSL Dev Team, United States

Covering what's new since last year's update at ICMC17.  Post Heartbleed, the project started a recovery effort. LibreSSL forked, and several older releases were EOLed. Started 1.1.0 in 2014 (depending on who you ask), and began working on hiding all the structures. Google then started their own fork (BoringSSL). Then the team released 1.1.0!

OpenSSL 1.0.2 is supported through the end of 2019; the last year is security fixes only. Support was extended by a year as the next LTS release wasn't ready. 1.1.1 will be the next LTS release, and 1.1.0 will only be supported for 1 year after that (security fixes only).

Very close to reaching exit criteria for 1.1.1 - want a final beta period after IETF RFC for TLS 1.3 is published (soon!). It's in editorial review, hoping nobody finds a major technical flaw at this point.  1.1.1 should be source and binary compatible with 1.1.0. Focus of the next release is FIPS.

Current CMVP 1747 expires in 2022 and we're not touching the 1747 code anymore. It's not on the historical release. 1747 is based off of OpenSSL 1.0.2, so there will be a gap.

Start porting your applications to the master gate. 1.1.1 has the same API/ABI as 1.1.0 and therefore the big "opaque" changes. FIPS will be moving forward, not backward.  You will interop on TLS 1.3.

Last HIGH CVE was in February 2017, found by fuzzing, and it was a crash.  Before that, it was November 2016 (also fuzzing and also a crash). Got a grant from Amazon to create a fuzzing database. Prior CVE was in September 2016 (found by a 3rd party, a memory growth probably leading to a crash).  We call them CVEs so downstream will know to pick them up.

Everything for OpenSSL is now done on GitHub. It has added features that make things easier to do. Every pull request is built 7 different ways with different options and various OSes.  Every pull request has to go through this CI process and must have a clean pass.

We have an active global community now - people from Amazon, Facebook, Google, Intel, Oracle, China (Ribose, Baishan), Russia (GOST ciphers).  It's good to be open source and to have great open source contributors.

The OMC meets annually face-to-face. Most folks don't believe we can fill 2 days... but always fill the time. With the exception of private finance items and release level stuff, everything is posted to the openssl-project mailing list.  Added video conferencing this year, and remote team members stayed online for the full 8 hours.

Features!

Protocol handling uses a safe API, no more of this: len = (p[0] << 8) | p[1]; read (ssl, buff, len); - Now it uses a safe API which understands the TLS protocol! No more open coding of protocol messages.
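The difference can be sketched in a few lines. The PacketReader class below is a hypothetical illustration in Python, not OpenSSL's actual internal API: every read is checked against the bytes that were really received, so a length prefix that claims more data than is present raises an error instead of driving an over-read.

```python
class PacketReader:
    """Bounds-checked cursor over a received buffer (illustrative only)."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0

    def remaining(self) -> int:
        return len(self._data) - self._pos

    def get_bytes(self, n: int) -> bytes:
        # Refuse to read past the end of what was actually received.
        if n < 0 or n > self.remaining():
            raise ValueError("truncated or malformed message")
        chunk = self._data[self._pos:self._pos + n]
        self._pos += n
        return chunk

    def get_u16(self) -> int:
        # Big-endian 16-bit integer, as used for TLS length fields.
        return int.from_bytes(self.get_bytes(2), "big")

    def get_length_prefixed(self) -> bytes:
        # Read a 2-byte length, then exactly that many bytes, both checked.
        return self.get_bytes(self.get_u16())


good = PacketReader(b"\x00\x03abc")
payload = good.get_length_prefixed()

bad = PacketReader(b"\xff\xffabc")  # claims 65535 bytes, only 3 are present
try:
    bad.get_length_prefixed()
    rejected = False
except ValueError:
    rejected = True
```

The point of the design is that callers never do pointer arithmetic on the raw buffer themselves; the only way to get bytes out is through calls that know how much data is left.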

New infrastructure - native threads support, DRBG-based CSPRNG, ASYNC support, auto-init and cleanup, unified build system, system-wide config files (able to turn off algorithms and specific features), new test framework (no new API w/out a unit test).

New cryptography! X25519, Ed25519, Ed448, ChaCha20/Poly1305 (DJB & Co), SHA3, SM2/3/4, ARIA, OCB. Many old/weak algorithms are disabled by default (still in source). New policy: only the EVP layer is supported, and only standardized crypto.

New network support. IPv6 revised and now complete, for example.

Did an external audit and addressed the code quality issues that came up. Getting better at responding to reported issues and bugs. More and better documentation; all things in the main section should be documented. Lots of old code (ifdef options) removed. Will only take new crypto that has been approved by a standards body.

And.. TLS 1.3. It works, people are using it in production.  It interoperates! It is different, so new issues and configs to think about. We know people are using it... but nobody is complaining (yet). Can't say at this point where the traffic is coming from (customer confidentiality), but it is coming in.

In the 1.1.0 release, there won't be FIPS. See Tim's next session for more details:-)

Still working on changing the license, can't commit to when / which release it might be.

Code is not noticeably smaller, but there are ports for embedded devices.

ICMC18: Avoiding Burning at Sunset - Future Certification Planning in Bouncy Castle

Avoiding Burning at Sunset – Future Certification Planning in Bouncy Castle (S22b) David Hook, Director/Consultant, Crypto Workshop, Australia

If you end up on the FIPS 140-2 historical list, you cannot be used for procurement by any Federal Agencies.  Agencies that want to procure such a module anyway must go through a risk management decision. It is also getting harder to rebrand or relabel someone else's certification.

If you've only done basic maintenance, that won't 'reset the clock' on your validated module - you have to do at least a 3SUB, which is not a full validation, but still a lot of work. The key is that the module must comply with current requirements.

Java has moved to "6 month" release cycles with periodic LTS releases. Some of the new algorithms, such as format-preserving encryption and new expandable output functions, do require revamping the API.  And... post quantum needs to be considered.

We had to split the Java effort into 3 streams - a 1.0.X stream representing the current API and 1.1.X representing the newer API.

The plan is that we can do updates to the 1.0.X stream with minimal retesting. The 1.1.X stream will require recompilation and more work.

All of the updates have to comply with the current Implementation Guidance, which is changing at unspecified rates.  A wise product manager will want to keep this in mind when doing long term planning.

Premier support for Java 9 finished in March 2018 and premier support for Java 10 finishes in September 2018.  Java 11 will be supported until September 2023, extended support until September 2026.

There is now an add-on for Java FIPS to allow use of post-quantum key exchange mechanisms with KAS OtherInfo via SuppPrivInfo.



Bouncy Castle 1.0.2 will still be targeting Java 7, 8 and 11. The older versions are still very popular. Will be doing a correction to X9.31 SHA-512/256 (8 plus 1 is 10?). Who uses this? Banks... Will also be adding SHA-3 HMAC and SHA-3 signature algorithms.

BC-FJA 1.1.0 will have updates for format preserving encryption (SP 800-38G), cSHAKE, KMAC, TupleHash and ParallelHash (SP 800-185), ARIA, GOST and ChaCha20, Poly1305. Avoiding some algorithms due to patents - trying to chase down someone to talk to who can speak for the patent holders. (acquisitions make this hard...)

We now have a bunch of Android ports - stripy castle. Could not use any other name, because of Google's use of org.bouncycastle, as well as org.spongycastle.

C# 1.0.1 is closer to Java 1.0.2 in some respects. We are still concentrating on the general use API. Assuming enough interest, we will do a 1.0.2 release to fix X9.31, complete SHA-3 support and complete ephemeral KAS support.



ICMC18: Keynote: Challenges in Implementing Usable Advanced Crypto

OS Crypto Track Keynote: Challenges in Implementing Usable Advanced Crypto (S22a) Shai Halevi, Principal Research Staff Member, IBM T. J. Watson Research Center

Advanced crypto goes beyond cryptography - includes proofs and things that complement use. We need it to be fast enough to be useful.

Your privacy is for sale - we give up privacy for services (directions, discounts on groceries, restaurant recommendations), we give up health data to look up personal medical solutions.

Data abuse is the new normal - the entire IT industry is making it easier to abuse. Larger collections of data, better ways to process them. It will get worse! If the opportunity is there to abuse, it will be abused.

Advanced cryptography promises blindfold computation - the ability to process data without ever seeing it - getting personalized services without giving access to your private information. It is useful for more traditional uses as well, like key management, but that's not the focus of this talk.

Zero knowledge proofs have been around for a long time (mid 80s?). The concept is that I have a secret that I don't want to tell you, but I can convince you of properties of my secret, which should be enough to prove that I hold it.

You can use this for grocery history - I can prove that I bought 10 gallons of milk this month, so I can get a coupon, without revealing everything else that I bought.
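As a rough illustration of the idea (not something shown in the talk), here is a toy version of the Schnorr identification protocol, a classic interactive proof of knowledge. The prover convinces the verifier that it knows the secret x behind a public value y without sending x. The group parameters are deliberately tiny and insecure, and all function names are made up for this sketch.

```python
import secrets

# Toy group: G = 2 generates a subgroup of prime order Q = 11 modulo P = 23.
# Real deployments use groups with orders of 256 bits or more.
P, Q, G = 23, 11, 2

def keygen():
    x = secrets.randbelow(Q - 1) + 1   # the secret, in 1..Q-1
    return x, pow(G, x, P)             # (secret, public value y)

def commit():
    r = secrets.randbelow(Q)           # fresh one-time randomness
    return r, pow(G, r, P)             # prover keeps r, sends t

def respond(x, r, c):
    return (r + c * x) % Q             # response masks x with r

def verify(y, t, c, s):
    # Checks G^s == t * y^c (mod P) using only public values.
    return pow(G, s, P) == (t * pow(y, c, P)) % P

x, y = keygen()
r, t = commit()                        # step 1: prover commits
c = secrets.randbelow(Q)               # step 2: verifier picks a challenge
s = respond(x, r, c)                   # step 3: prover responds
accepted = verify(y, t, c, s)
```

The verifier only ever sees y, t, c and s. Because r is random and used once, s on its own gives no information about x, yet a prover who does not know x can pass the check only by guessing the challenge in advance.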

The next concept is secure multi-party computation. We all have our individual secrets. We can compute a function of these secrets w/out revealing them to each other (or anyone else!). Has been around since the 1980s.

You could use this with medical data to determine the effectiveness of some treatment.  Data for different patients are held at each clinic, but the effectiveness can be shared.
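One of the simplest building blocks for this is additive secret sharing. The sketch below is an illustration under simplifying assumptions (honest-but-curious parties, a made-up clinic scenario, hypothetical function names), not a protocol from the talk: each clinic splits its private count into random shares, and only sums of shares are ever revealed.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus; all arithmetic is done modulo this

def share(value, n_parties):
    """Split value into n shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs):
    n = len(private_inputs)
    # Each party splits its own input and hands one share to every party.
    dealt = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received from everyone.
    partials = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Only these partial sums are published and combined.
    return sum(partials) % MODULUS

clinic_counts = [12, 7, 30]  # hypothetical successful treatments per clinic
total = secure_sum(clinic_counts)
```

Each individual share is uniformly random, so a single share says nothing about the clinic's real number; only the combined total is learned.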

The other concept is homomorphic encryption. Data can be processed in encrypted form and the result is also encrypted - but inside is the result of the function. Has been described in papers going back to 2009.

I could encrypt my location and send it to Yelp, Yelp computes an encrypted table lookup and gives me ads for nearby coffee shops. I could then get back encrypted results and then get coffee. :-)
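To make "computing on encrypted data" concrete, here is a toy sketch using the Paillier cryptosystem. This is an assumption-laden illustration: Paillier is only additively homomorphic, which is much weaker than the fully homomorphic schemes from 2009 onward that the talk refers to, and the primes are far too small for real use. Multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts.

```python
import math
import secrets

# Toy Paillier key; real keys use primes of 1024 bits or more.
P, Q = 17, 19
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)

def L(u):
    return (u - 1) // N

MU = pow(L(pow(G, LAM, N2)), -1, N)   # modular inverse (Python 3.8+)

def encrypt(m):
    # Pick random r coprime to N so the ciphertext is randomized.
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    return (L(pow(c, LAM, N2)) * MU) % N

def add_encrypted(c1, c2):
    # Done by a party that never sees the plaintexts or the private key.
    return (c1 * c2) % N2

a, b = encrypt(20), encrypt(22)
total = decrypt(add_encrypted(a, b))
```

Here the party calling add_encrypted plays the role of the service: it works only on ciphertexts, and only the key holder can read the result.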

Improving performance has been a major research topic for the last 30 years - we've made progress, but it will take a lot of very knowledgeable engineers to implement it.

Digital currencies need to prove that you have sufficient unspent coins on the ledger, constructing the proof in less than 1 min and verifying it in a few microseconds - this needed the performance improvements to get it to perform that well.

You can use these encryption techniques and the speed improvements to find similar patients in a database in less than 30 seconds, or compute private set intersections.

By speeding up homomorphic encryption, you can compute the similarity of two 1M-marker sequences in minutes, or inference of simple neural-nets on encrypted data. 

But - all of these are complex, so not generally available.

There are a lot of software libraries that implement ZKP / MPC / FHE - most are open source, but it's very hard to compare them and decide which to use for what.  They have different computation models, performance profiles and security guarantees, and there are hardly any accepted benchmarks.

Distributed computing is already very complex by itself. Adding advanced cryptography into it makes it that much more complicated (needs oblivious computation). Good performance needs extreme optimization - straightforward implementation will be terribly non performant. You need to be familiar with the techniques to optimize for what you're trying to do.

Communication between parties is the bottleneck in many protocols for secure multi-party computation. To optimize, many libraries work with sockets - they expect to be "in charge" of IP-address:port.  Retrofitting existing libraries is also very complicated.

How can you tame the complexity? You need frameworks and compiler support, tool boxes for common tasks and to shift our focus to usability.

We need to engage cryptographers and system builders to make this happen.





ICMC18: Panel Discussion: Technology Challenges in CM Validation

Panel Discussion: Technology Challenges in CM Validation (G21b) Moderator: Nithya Rachamadugu, Director, CygnaCom, United States Panelists: Tomas Mraz, Senior SW Engineer, Red Hat, Czech Republic; Steven Schmalz, Principal Systems Engineer, RSA—the Security Division of EMC, United States; Fangyu Zheng, Institute of Information Engineering, CAS, China

All three panelists have been through their share of validations, and Fangyu has also had to deal with the Chinese CMVP process.

As to their biggest challenges, everyone agrees that time is the issue. Tomas noted that it's very difficult to get the open source community excited about this and doing work to support the validations. For Fangyu, they often have to maintain 2 versions of several algorithms, one for US validations and one for Chinese.

In general, it's hard to find out what the requirements are from most customers here, particularly across various geos.

Several panelists agree this is seen as an expensive checklist. Steven also worries about the impact on business - it goes beyond what you pay the labs and the engineers to write the code. It's hard to get this done and get it to all the customers. Tomas noted that there are conflicting requirements between FIPS and other standards (like AES GCM, though that has been recently addressed).

On value of the certification, have you found anything during a validation that made your  product more secure? Steven notes you can talk about methodologies for preventing software vulnerabilities, and the devs will come back and say why didn't it work for "so-and-so"? But if you look specifically at the testing of the algorithm, it gives you value that you've implemented the cryptography correctly.  Not clear we get as much benefit out of the module verification.  Agreement across the panel that CAVP is valuable, technically.

Steven really wishes there was a lot more guidance on timing of validations and how to handle vulnerabilities. Tomas notes it's hard to limit changes to the boundary in the kernel, because we need to add new hardware support and other things. Fangyu noted that even rolling out fixes for a hardware module is challenging.

All panelists are excited about the automation that is happening, though Steven is wondering if it will really be possible for the module (algorithm testing seems very automatable, and that will still help).  Steven talked about industry trend to continuously check status of machines, make sure they are up to date with patches, etc - getting this automation in can help people continuously check their work, even on development modules.

All panelists noted that the validation process could be improved, but it won't help the overall security of the system.

Customers want Common Criteria and FIPS 140-2, but don't really understand what they mean; they just want to make sure they're there. Trying to do both at the same time is difficult: you have to line up the validations and make sure all teams understand when they need them. And... they both still take too long to get.

On the topic of out of date or sunset modules - it's unclear how many customers may be running these, but Steven has heard support requests come in for out of date modules. They use that as an opportunity to get them to upgrade. Tomas noted they won't likely be able to "revive" the sunset module, due to how quickly the IG and standards change.

ICMC18: 10 Years of FIPS 140-2 Certifications at Red Hat

 10 Years of FIPS 140-2 Certifications at Red Hat (G21a) Tomas Mraz, Red Hat, Czech Republic


Red Hat was founded in 1993 and received its first FIPS 140 validation in 2007, together with Sun Microsystems, for NSS (Network Security Services), which was designed to be FIPS 140 compliant from the start.
We spent a lot of effort in Red Hat Linux to get everything to use NSS (cURL, RPM, OpenLDAP, OpenSWAN), but could not convert everything. Too many differences in APIs.

So, changed to “validate everything” mode!  OpenSSL based on the original FIPS module from OpenSSL, but partially evolved for Red Hat.  For OpenSSH, did their own independent FIPS work. Libgcrypt, hired a community developer to integrate FIPS support upstream.  For OpenSWAN and later libreswan, hired the community developer to port to NSS and then get FIPS support.  DM-crypt first had its own crypto, but later switched to use libgcrypt. GnuTLS was done after Red Hat hired the main developer of the project.

Highlights – we were able to do this at all. We could do it quite quickly with existing modules, and we never included Dual-EC DRBG so avoided big issues there. Some small implementation bugs were found by CAVS testing.

Lowlights – the process is still too slow and expensive to be able to revalidate everything we release, creating a conflict between fixing bugs and security issues and the need to have the software validated.  Sometimes new crypto has to be disabled in the FIPS mode (even though its security is well established, like ChaCha20-Poly1305, Curve 25519 DH).  Some of the requirements really are for hardware and don’t make sense for software, and implementing them does not improve the software.
Lowlights – more! The restrictions are too tight on the operating environment. HW requirements are ignored by customers, and other products built upon RHEL are marketed under a different name – confusing!  The open source community does not care about government customers, so it calls this nonsense, silliness and garbage.
We needed to design the process of turning on FIPS mode so that it would not interfere with regular customers that don't care about FIPS mode at all.  This is all more restricted in containers as well (both host and container must be RHEL, for example).
In libgcrypt, non-approved algorithms such as MD5 are blocked, which is annoying to customers.
In the future, we want to continue to work with NIST to improve the process and continue to work on the ACVP project, to speed up revalidations.  We may have more or fewer crypto modules; fewer if we can get more utilities to use our validated libraries (like moving SSH from using OpenSSH crypto to OpenSSL).
 

Wednesday, May 9, 2018

ICMC18: FIPS 140-3 Update

FIPS 140-3 Update (C13c) Michael Cooper, IT Specialist, NIST, United States

Mr. Cooper would love to give us a signature date, but... he can't (it's out of his control). There is a general set of documents that point to ISO 19790 and ISO 24759. It's gone through the NIST processes (legal reviews, etc.), and now we are at the last stage: waiting for the Secretary of Commerce to sign. This is a timing thing - wheels are in motion.

The document that's going in for signing is just a wrapper document, basically pointing only to those other documents and no modifications.

Hoping that leveraging an international standard will simplify testing requirements for vendors. Already going to CC meetings to see who else is interested in this, and looking into automation for this as well.

Standardizing testing, especially across NIAP and CC, will help extend the adoption of the standard.

The algorithm automated testing will give us a start on automating module testing. We want to leverage ideas from around the world, academia and industry.

Question from the audience - what other country has signed up for this? So far, none, but there is interest.

Q: Does FIPS 140-3 point to a specific version of the other documents? Yes, but it is worded to make it easier to update to newer versions as needed. This gives more flexibility.

Q: what's going to be the sunset of FIPS 140-2? Will likely follow something similar as to what we had before, there will be documentation to guide folks. Likely a year to submit against old scheme.

Q: What about the old IGs (Implementation Guidance) documents? Will they go away? About 50% of them, the rest will need to be updated.

Q: Why are we starting with the 2012 draft, and not the 2015 draft? With the mandate to update standards every 5 years? FIPS 140-3 won't have to change, we can update what it points to. We will forever be FIPS 140-3, pointing to the 'latest' ISO standard.

Q: How often do the ISO standards get updated? Every 5 years?