Monday, October 11, 2010

GHC10: Friday Keynote, Barbara Liskov, Another Perspective

I did not originally blog on Dr. Barbara Liskov's Friday morning keynote, but found while writing up my trip report that many of the things she mentioned had really stuck with me so I wanted to share with a wider audience.

First of all, Dr. Liskov was an amazing and energetic speaker - enough to keep 2000 jet-lagged women wide awake through an intense technical walk through the history of structured programming languages at 8:30 in the morning. Fascinating and inspiring!

My notes mostly come from my twitter feed, as well as Teri Oda's, and the Grace Hopper Conference wiki. Hope you get something from them as well!

Friday morning was full of extremely technical talks, beginning with the 8:30 AM keynote from Barbara Liskov, Professor at MIT and 2008 ACM Turing Award Winner. Dr. Liskov regaled us with the evolution of programming languages by describing a series of must-read papers and the advances she made to this area of the science. She started in computer systems, and in those days, it was the job of the programmer to make up for the lack of system resources and under-provisioned systems.

Dr. Liskov's advice:
  • "Reading programs is much more important than writing them." (she notes people will be reading your program for years to come and you only write it once - comment!)
  • "Don't try to work on a problem when you get too tired. The solution won't come to you until you're rested."
  • "Programmers think in terms of programming languages...if the language supports an idea it's much more accessible to them."
Dr. Liskov's recommended papers:
Dr. Liskov was a pioneer in computer language development. Many of the concepts she was discussing with her peers in the 1970s are just now appearing in modern languages. When asked what her advice was on the best "first language", she said "Python is used a lot, but lacking features we want students to learn. C# and Java have those, but are harder to learn."

[Update: Thank you, Kelly, for the additional papers!]

Friday, October 1, 2010

GHC10: The Power of the Purse: Making Our Collective Voices Heard

The panel started out with some great slides that showed how much more women use technology than men. Women make 70% of the consumer buying decisions, women dominate higher education (140 degrees per 100 for men), and women are more likely to work in health care and education, fields that are slightly more resilient to economic swings.

The panelists include Kathleen Naughton (HP), Cathy Lasser (IBM), Wei Lin (Symantec), Divya Kolar Sunder (Intel), Vidya Dinamani (Intuit) and Patty Lopez (Intel).

Vidya said that Intuit has done a lot of lab and in-home studies about financial behaviours. They find that independently men and women behave similarly, but when they are put together to work on things like taxes, men are very quick to answer questions while women take the time to understand the questions and make sure they are answering them correctly.

Divya, a new mom, talked about shopping for baby products and how troublesome it was to find a diaper bag that worked for both her and her husband, and baby bottles that seemed more like mom. She found that reading blogs from other mothers, who seem to naturally want to share their experiences, helped her find what she needed.

Cathy, from IBM, is actually researching how people shop online. One of the things they have found is that people are much less likely to return items they've bought online, which makes sense as it's harder. One thing they thought women might like was to have an avatar of sorts so they can see how an outfit would look on them, but it turns out most women don't want to see a 3D image of themselves, so it actually discouraged purchases. An audience member said she felt similarly about shopping in stores, that she didn't like how clothes looked there, but did at home, so she preferred shopping online.
I can get that - it seems many stores always have really awful, harsh overhead lighting that, even when I was a skinny teenager, made me look awful.

A few of the panelists then discussed their thoughts on online retailers doing data mining, mostly saying they are comfortable with this occurring as it so greatly improves their shopping experience. There is some concern that the retailers need to store this data and use it in a safe manner, though there doesn't seem to be a good way to check this and there are currently no standards to protect the consumer. Wei, who works in security at Symantec, disagreed. One of the behaviours she has witnessed that she finds disturbing is when you shop for a type of item at one online retailer and then go somewhere else, you'll get an ad for that item. It's not clear to the consumer if this is a legitimate service or spyware.

Cathy said that a lot more companies are listening to feedback from their customers to redesign things - like NorthFace jackets and providing covers for cell phones to brighten them up so they can be found in purses!

Wei wanted to share some best practices with us: never give out your password, never give out personal information, never open a link or attachment from a stranger, change your password (personal one, too) frequently, use malware and virus detection software from trusted sources, and don't use a debit card for online shopping. Wei also recommends getting a password wallet to help you manage all these passwords, so you can change them frequently. I would caution you to be careful when choosing such software, as it can itself be malware! You don't want to make it too easy for the hackers! :-)

There seemed to be a lot of questions about security and best practices for privacy on the Internet, so perhaps the Grace Hopper Conference needs a security and privacy track next year! :-)

GHC10: Fighting Cyber Crime: Technology that Fights Crime and Protects Our Children

You have a 6 in 10 chance of being impacted by cyber crime, yet people worry far less about this type of attack than they do about snake bites or getting struck by lightning. Rhonda Shantz, from Symantec, is concerned about this general lack of concern. Other panelists today include Cristina Fernandez (National Center for Missing and Exploited Children), Sarah Seltzer (Microsoft), Les Nichols (Boys and Girls Club of America), and Erica Christensen La Blanc (CA Technologies).

[TRIGGER WARNING: Some of the content below, which has to do with exploited children, may make some readers uncomfortable or bring up painful memories. Please proceed with caution.]

The cool thing about these panelists is the incredibly diverse backgrounds that brought them all into areas that protect children. For example, Les was an architect (not in the sense that we think of in the software industry, but rather the type that designs buildings) and Erica started out in television!

Taking us straight to the facts, the panel lets us know that 62% of children are having some sort of trouble online (sexual predators, bullying, stalking, virus, malware) and only 45% of parents know this.  WOW! According to the National Center for Missing and Exploited Children, pimps are using social networks to try to recruit children and others into their prostitution rings. Only about half the children who are exploited online report to their parents, because they are afraid if they do tell, they will lose their Internet access. No matter how terrible it is being exploited or harassed online, it's not worth it to them to report because the typical parental response is to take the child off of the Internet. It's hard to imagine how important Internet access has become to our children - definitely something for parents to keep in mind.

The Internet, which makes all of our lives easier, has unfortunately made it 'safer' for pedophiles to get access to exploitative material and connect with other pedophiles that they can trade material with (peer to peer networking gone bad). Now technology companies like Microsoft, Symantec and CA are looking for technological systems to find inappropriate images, shut down servers and find the predators. While I've always associated groups like the National Center for Missing and Exploited Children with working on this issue, it is heartwarming to discover some really large businesses are helping to find these disgusting criminals. The agencies that focus on children unfortunately have little technology experience and have come to rely on these other companies to help them bridge the gap to protect children.

Norton provides a tool called Norton Online Family for free, which aims to help parents protect their children without overly restricting the child's access to the Internet. Boys and Girls Club of America has My Club My Life for teens and Net Smartz; that does require the children to voluntarily give up some of their online access, but they are seeing children willing to do this.

Microsoft is working with Dartmouth on PhotoDNA, a fascinating piece of software that can identify inappropriate photos and permutations (resized, cropped, etc) in other places and help server admins take them down and find the perpetrators.

This is a truly frightening area for our youngest generation, and I'm glad to see some really brilliant people working on this!

GHC10: Computational Sustainability: Computational Methods for a Sustainable Environment, Economy and Society, Carla P. Gomes

Professor Carla P Gomes, faculty of Computing and Information Science and director of Institute for Computational Sustainability, is a pioneer in the field of computational sustainability.

In 1987, there was a UN report that first raised concerns about human impact on the planet. A follow-up report showed things like the biomass of fish is 10% of what it was 50 years ago. We're overharvesting our planet and overusing our resources. A 2009 report looked at whether or not we've crossed the tipping point, and it was looking grim. All these things inspired Professor Gomes to do further research in this area to see what we could do to help reverse the tide using the field of computer science. She strongly believes that computer scientists can, and should, play a key role in increasing our efficiency of managing natural resources.

Computational sustainability encompasses many disciplines like economics, sociology, environmental sciences and engineering, biology, crop and soil science, meteorology and atmospheric science. There is a need to develop computational methods to model things in these fields, which will help resolve these problems. This cross-discipline model helps all fields learn new research models from each other, which is helping things in this area to progress.

One problem this field is addressing is wildlife corridors, which link biological areas and allow animal movement between them. One of the issues here is that, while important for the animals, there isn't usually much money available to buy land, etc, to set these corridors up so that animals in different national preserves can cross populate. This is a computational problem - you need to find the graph that has the best and cheapest path between the two places. While this is an NP-hard problem, the computer scientists can simplify the problem by modeling it as a Min Cost Steiner Tree. Models are critically important in solving these problems and for addressing the issues of scale.

This approach allows them to handle large problems and reduce corridor cost dramatically, allowing the projects to actually proceed as opposed to being ignored or done with too much expense or in a sub-par fashion that won't help the animals as much as possible. Her work has been done for grizzly bears and wolverines.
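To make the corridor idea a bit more concrete, here is a minimal sketch of my own, not Professor Gomes's actual method. It assumes the landscape is a small grid where each cell has a made-up purchase price, the reserves are cells that are already owned (price 0), and it uses a simple greedy shortest-path heuristic to approximate a Steiner tree: grow the corridor from one reserve and repeatedly attach the reserve that is cheapest to reach. The names land_cost, reserves and build_corridor are hypothetical, and a greedy heuristic like this is not guaranteed to find the cheapest possible corridor.

```python
import heapq


def cheapest_link(cost, corridor, targets):
    """Multi-source Dijkstra: cheapest path from any corridor cell to any target."""
    rows, cols = len(cost), len(cost[0])
    dist = {cell: 0 for cell in corridor}
    prev = {}
    heap = [(0, cell) for cell in corridor]
    heapq.heapify(heap)
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell]:
            continue  # stale heap entry
        if cell in targets:
            # Walk back until we hit a cell that is already in the corridor.
            path = []
            while cell not in corridor:
                path.append(cell)
                cell = prev[cell]
            return d, path
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]  # pay the price of the cell we enter
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    raise ValueError("some reserves cannot be connected")


def build_corridor(cost, reserves):
    """Greedy Steiner-tree approximation: attach the cheapest reserve each round."""
    corridor = {reserves[0]}
    remaining = set(reserves[1:]) - corridor
    total = 0
    while remaining:
        d, path = cheapest_link(cost, corridor, remaining)
        corridor.update(path)
        total += d
        remaining -= corridor
    return corridor, total


# Hypothetical land prices; 0 marks land that is already protected.
land_cost = [
    [0, 1, 9, 9],
    [9, 1, 9, 0],
    [9, 1, 1, 1],
    [9, 0, 9, 9],
]
reserves = [(0, 0), (1, 3), (3, 1)]

corridor, total = build_corridor(land_cost, reserves)
print(sorted(corridor), total)
```

Real corridor planning works on far larger graphs and also has to account for habitat quality, which this toy version ignores.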

Now she is working on assisting the recovery of a subspecies of woodpecker, by analyzing network cascades. They are buying up the land where the birds fly, then looking at the birds' flight patterns and buying nearby land, which will help the birds spread their territory and, in turn, increase the population. The complicated issue is figuring out which land the birds will choose to spread to.

Further consideration is necessary for species interaction, as not all species interact in a cooperative manner.

They are getting help from the eBird project, at Cornell, which allows average folks to submit data about bird sightings. This helps them to learn where the birds are migrating and how long they spend in various areas.

Many of these concepts can also be applied to analyzing solutions to problems faced by very impoverished communities. For example, what will be more valuable to the impoverished? A chicken, improved roadways, or providing cell phones?

Back to the problem of overfishing, it seems to be caused by mismanagement. Professor Gomes is looking at models to help correct this mismanagement without causing any additional problems. Even after they figure out recommendations they need to get the fisheries to implement them. It is difficult to convince fishery owners that periodically closing the fisheries will actually lead to more fish when they reopen - you gotta give the fish time to reach reproductive age and reproduce!

Another thing her team is studying is the impact of fertilizers. While they do greatly increase the amount of food that can be harvested, they end up creating dead zones. On top of all that, they are also studying how to discover materials for fuel cell technology! These, again, Professor Gomes claims are problems for computer scientists.

Professor Gomes's research area is so incredibly broad! She shared with us, more quickly than I could capture, many of the different algorithms and approaches they are using to solve these problems. I got a great mini-introduction to all sorts of algorithms and data structures I'd never heard of before, like spatially balanced Latin squares! She is an amazingly energetic, intelligent and passionate technical speaker and I think I could spend an entire day listening to her!

GHC10: Anita Borg Technical Leadership Award Winner Laura Haas

Laura Haas, IBM Fellow, has been recognized by the Anita Borg Institute and the Grace Hopper Conference for her outstanding contributions to technology. I am so happy to be here to hear her talk today on information integration!

Haas and her team are trying to tackle the problem of how do we get information to people when they need it? For example, if a doctor is treating a patient with cancer, she will need to find information on how this type of cancer has been treated in the past, how well the treatments have worked and access past patient records.

The challenges faced are that you have diverse data models, overlapping data, and incomplete and often inconsistent data. Different people involved want different views of the data, and needs and knowledge change over time.

In order to do data integration, you need to understand what is available, as well as what the data means or its intent. You have to set up the schema, figure out how to identify information about the same object and figure out what to do with missing or inconsistent data. You need to decide which problems you're trying to solve, and execute - and hope the customer doesn't come back and tell you that they really wanted something else entirely :-)
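Those steps (map the schemas, recognize records about the same object, decide what to do with missing or inconsistent values) can be sketched in a few lines. This is only my own illustration, not anything shown in the talk: the two sources, their field names, and the rule "match on name plus birth date, keep the first value seen and log any disagreement" are all assumptions made up for the example.

```python
def normalize(record, field_map):
    """Rename source-specific fields to the shared schema and drop empty values."""
    out = {}
    for src_field, shared_field in field_map.items():
        value = record.get(src_field)
        if isinstance(value, str):
            value = value.strip()
        if value not in (None, ""):
            out[shared_field] = value
    return out


def match_key(record):
    """Assumed identity rule: same name and birth date means same person."""
    return (record["name"], record["born"])


def integrate(sources):
    """Merge records from several sources; report conflicting values."""
    merged = {}
    conflicts = []
    for records, field_map in sources:
        for raw in records:
            rec = normalize(raw, field_map)
            key = match_key(rec)
            target = merged.setdefault(key, {})
            for field, value in rec.items():
                if field in target and target[field] != value:
                    # Keep the first value, but remember the disagreement.
                    conflicts.append((key, field, target[field], value))
                else:
                    target[field] = value
    return merged, conflicts


# Two hypothetical sources describing overlapping patients.
clinic = [
    {"patient_name": "Ada Lovelace", "dob": "1815-12-10", "diagnosis": "type_a"},
    {"patient_name": "Grace Hopper", "dob": "1906-12-09", "diagnosis": ""},
]
registry = [
    {"full_name": " Ada Lovelace", "birth_date": "1815-12-10",
     "dx": "type_b", "treatment": "surgery"},
]
clinic_map = {"patient_name": "name", "dob": "born", "diagnosis": "diagnosis"}
registry_map = {"full_name": "name", "birth_date": "born",
                "dx": "diagnosis", "treatment": "treatment"}

merged, conflicts = integrate([(clinic, clinic_map), (registry, registry_map)])
print(merged)
print(conflicts)
```

Even in this toy version the hard decisions are policy ones: which source wins a conflict, and how confident you can be that two records really describe the same person.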

Dr. Haas started her career in 1981 at IBM and relational databases were just coming onto the scene. You no longer had to be a database wizard to write code to interact with a database - which broadened the concept of information integration. They even called it "eager integration" - as you could eagerly get as much data as you wanted.

She then started her work on the project R* (pronounced R-star), which was a distributed relational database management system. One query was allowed to access data in multiple, homogeneous relational DBMS. This type of system helped prevent data loss and helped to distribute queries and transaction management.  While the project did not have much commercial success, it paved the way for a lot of work in database systems and future products for IBM and her own future research.
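As a rough illustration of what "one query over several relational databases" looks like, here is a small sketch using Python's built-in sqlite3 module. It is not R* and involves no network or distributed transactions; it simply attaches a second in-memory SQLite database to one connection so a single SQL statement can join tables that live in two separate databases. The table and column names are invented for the example.

```python
import sqlite3

# One connection, two separate databases: "main" and an attached "site_b".
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")

conn.execute("CREATE TABLE main.patients (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE site_b.treatments (patient_id INTEGER, drug TEXT)")

conn.executemany(
    "INSERT INTO main.patients VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO site_b.treatments VALUES (?, ?)",
    [(1, "drug_x"), (2, "drug_y"), (2, "drug_z")],
)

# A single query that reads from both databases.
rows = conn.execute(
    """
    SELECT p.name, t.drug
    FROM main.patients AS p
    JOIN site_b.treatments AS t ON t.patient_id = p.id
    ORDER BY p.name, t.drug
    """
).fetchall()
print(rows)
conn.close()
```

A real distributed DBMS also has to decide where each part of the query runs and keep transactions consistent across sites, which is where the hard research problems are.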

Relational database technology was growing rapidly in 1984, a very exciting time for those in the industry. Dr. Haas then joined the Starburst team (no, not named for the "fruit" chews, but named as an extension of the R* project). This was an extensible relational DBMS that allowed many types of additions - new functions, optimizations, indexes, data types, and storage methods. The best part of this? This project had legs - and became the foundation for IBM's DB2 "for workstations".

Several people that worked on this project ended up being named Fellows or Distinguished Engineers, though she notes it took her a lot longer to get Fellow than her male colleagues and she had to earn many more accolades. Dr. Haas recommends that you surround yourself with the best team you can find. Do not be intimidated if they are better or smarter than you are, as they will take you places!

Dr. Haas was able to take a sabbatical from IBM to study at the University of Wisconsin-Madison, where she studied with the brightest minds in database technology at the time (1992).

One of the new problems that needed to be solved in 1993, when she returned to IBM, was how to store images, videos and text that were starting to proliferate online. Digital libraries started to emerge in this time frame and they would eventually leverage relational DBMS. Customers were starting to want databases that could store multiple data types, so Dr. Haas and her team looked back at concepts from R* and Starburst to solve the problem and started a new project... Garlic.

Why Garlic? Because Dr. Haas doesn't like acronyms, which IBM was famous for at the time, and she loves to cook. Garlic and chocolate being her favorite things - her old team thought if they renamed the team/project to Garlic, they'd get her to come back off of sabbatical. It worked!

Garlic was a data-less (object-)relational DBMS (aka virtual DBMS/federated DBMS). It had all the benefits of a high-level query language and all the features of the underlying data sources. This not only became a product for IBM, but started two separate business units (Life Sciences and InfoSphere Information Integration). Something that is very obvious listening to Dr. Haas speak is that once you find people you like working with - stick with them. You can do amazing things!

IBM was having trouble with integration, as people working in life sciences that were trying to work together wouldn't use the same database as their colleagues, so Dr. Haas's team worked on something called InfoLink to attempt to bridge this gap. Unfortunately the project was not a market success, but did help get IBM in the door at new customers and led to the InfoSphere suite - "a complete line of products for all your integration needs."

The longer Dr. Haas was at IBM, the larger her teams got - from a 10 person research team to a 120 person development organization and eventually to over 700 people (no team picture for that group... :-)

While this all sounds wonderful, there were still major problems that needed to be solved in 1999. As more people were adding federation to their systems, issues emerged. Set-up of federation was too slow and complicated, and while the development team had assumed users would be doing very simple joins/queries, it turned out that complex queries were more the norm.

This led to yet another project, Clio (not an acronym!), to do schema mapping by simply drawing lines! This opened up many more doors for IBM in the DBMS space and gave the researchers many more ideas for future projects.

What impressed me most about Dr. Haas was the importance she gave to her team. She was so proud of each and every person she ever worked with, remembered their names and knew all about what they were doing now. Dr. Haas is clearly an amazing collaborator and it's not surprising that these brilliant people want to work with her.

What a phenomenal technical woman, very deserving of the Anita Borg Technical Leadership award!