PASS Summit 2015: Women in Technology Luncheon

It’s Thursday at the PASS Summit so that means it’s time for the Women in Technology Luncheon.  As in years past (I’ve lost count of how many), the luncheon is sponsored by SQL Sentry.  The SQL Sentry team is here at Summit in full force, and I have Allen White at the blogger’s table with me.  But while I’m at it, let me give a shout out to a few members of the SQL Sentry team that have been supportive of not just this event, but of me and some of my colleagues.  These gentlemen have provided feedback, suggestions, and good old-fashioned support whenever asked or needed.  Thank you Aaron Bertrand, Kevin Kline, Nick Harshbarger, and Greg Gonzalez for all you do for me, my colleagues, and this community.

For those of you at home, you can watch the luncheon live on PASSTV.  Finally, if you want more rapid-fire commentary from the luncheon (as I’ll refresh this post every 5-10 minutes), I recommend following Mark Vaillancourt on Twitter (@markvsql).

Today’s luncheon features guest Angie Chang from Hackbright Academy, the VP of Strategic Partnerships, and we start with PASS Board VP of Marketing Denise McInerney welcoming us to today’s lunch (it’s the 13th one).

Angie starts by talking about her path from undergrad to her position today.  She started the Girl Geek Dinner chapter in San Francisco, and Hackbright sought her out to help celebrate the first graduating class of Hackbright.  Hackbright has graduated around 300 women over the past 3 years, and a few of those women now hold technical management positions.  Hackbright was started by some women who attended a coding camp.  The group started with an experiment of 12 women, teaching them to code in 10 weeks.  Since then they have grown the classes and the curriculum has evolved.  Right now they teach Python, along with some Java and Angular – students are taught not just the language, but also to ask questions.  Each engineering fellow has three mentors.  There are 100 software engineers who mentor those students for one hour a week.  This mentorship helps enhance the experience, and the students also get to visit other technical companies (e.g. Twitter, Dropbox).

Hackbright uses pair programming.  The community aspect is important – particularly because it’s an all-women environment.  The environment is very casual.  The students at Hackbright are very diverse and come from a variety of backgrounds.  Hackbright has a high rate of job placements.  Angie highlights some graduates of Hackbright who have been promoted to engineering management positions within their companies.  SurveyMonkey has hired the most “Hackbright’s” of any company and one of the engineers is a manager there now.

Hackbright works with partner companies by inviting them to career day events and the Hackbright graduation.  Facebook sponsors a scholarship once a quarter, and Denise’s company, Intuit, also provides a scholarship.  Girl Geek Dinner started in London in about 2006, and Angie was working at a startup at that time.  Angie started up the Girl Geek Dinner in Mountain View, sponsored by Google – they had 400 people in 5 days.  They are booked into 2017 for dinners, with 2-3 per month.

Denise shifts to talking about the pipeline problem.  One Hackbright instructor, Rachel Thomas, wrote a post, “If you think this is a pipeline issue then you haven’t been paying attention.”  The article has suggestions for how to improve the pipeline – it’s not about getting women in, it’s about retaining them.  Denise asks Angie if she feels retention will be an issue for those graduating from Hackbright – and Angie states that they create a good network for each graduating engineer – their classmates at Hackbright, their mentors, etc. – which gives each person a set of resources to turn to when they’re struggling.

If you have questions you can come up to the microphone or use the #passwit hashtag on Twitter.

There is a documentary from Technovation called Codegirl, which will stream on YouTube from November 1-5; check out the trailer.

Want to see if you have any unconscious biases?  Check out these tests on Harvard’s site.

PASS Summit 2015: Day 2

8:20 AM

We’re off and running with Adam Jorgensen, PASS EVP of Finance.  Adam takes this opportunity to provide an update about the financial status of PASS as this satisfies the requirements of the by-laws.  The largest source of revenue is the PASS Summit (not a surprise), bringing in just over 7 million dollars (of the 8 million generated in the 2015 fiscal year).  The finances continue to trend upward, which is great.  Finances support the community through events all year long.  This year, 78% of every dollar taken in goes back to a community program.  PASS is in great financial health, and has increased reserves to 1.14 million dollars.  Starting this year, portfolio-level budget summaries will be published, to make the process more transparent to the community.  Last year’s goals for 2015 were to focus on support for SQLSaturdays and Chapters, among others.  PASS Summit will be in Seattle through 2019.  The SQLSaturday website was relaunched this past year to help better support the events.  This year, goals include the BA Community Portfolio, refocus investments to community profiles, global growth program, sales portfolio, technology investments (including a re-design of sqlpass.org ELS: this makes Jes happy).  Adam wraps up by thanking Amy Lewis, outgoing board member.

8:33 AM

Adam finishes up and EVP Denise McInerney comes on stage.  Denise takes a minute to thank Bill Graziano, who is the outgoing Immediate Past President.  Bill has been a member of the board for 10 years. ELS: I’m personally a big fan of Bill, I worked with him on the NomCom.

Denise moves on to the PASSion Award.  There were 71 Outstanding Volunteers this past year.  This year’s PASSion Award goes to Lance Harra.  He runs the Kansas City SQLSaturday and was an integral part of the program committee.  If you are interested in becoming a part of the SQL Server leadership team, stop by the Community Zone this week.  There are always ways to get involved with PASS.

There are over 150,000 members of PASS.  There are 3000 people from over 95 countries tuning in live.  Yesterday PASS introduced foundation sessions, which were offered by Microsoft (four of them yesterday).  Over the years PASS has grown its offerings to meet its members’ needs – virtual chapters, 24 Hours of PASS, SQLSaturday, user groups, and more.

Today is the Women in Technology lunch (11:45) sponsored by SQLSentry, and the keynote speaker is Angie Chang.  It will be live streamed on PASSTV.  Today is the Board Q&A at 3:30.  Tonight is the Community Appreciation Party at the EMP Museum at 7 PM.

PASS Summit next year is scheduled for October 25 – October 28 – early bird pricing is available!

Today’s keynote speakers are Dr. David DeWitt and Dr. Rimma Nehme. (ELS: TWO OF MY FAVORITES!!!) They are both at the Microsoft Jim Gray Systems Lab in Madison.  Their talk is Data Management for the Internet of Things.

8:45 AM

Dr. Nehme takes the stage.  She mentions that it’s harder to present a keynote together than individually.  She will start, Dr. DeWitt will come in, then Dr. Nehme will wrap up (dessert!).  The what, why, and how of IOT.

Disclaimer: not announcing a product.  Goal is to inform, educate, and inspire (and entertain a bit).

Wants to begin with a new reality.  Things around us have a voice that can communicate to us.  IOT is a collection of devices and services that work together to do something useful.  Basic formula: take a basic object, add controller, sensor and actuator, add the internet, and then you get the internet of things.

Take the sensors and actuators, add connectivity and big data analytics, and then you can provide new services and optimization.  The target is to create value (make money).  What does that typically look like?  Collect data from sensors, aggregate it, analyze, then act on it. This is a continuous loop.  There are 2 types of IOT that people agree upon.  On one side have a consumer internet of things – things that are wearable, related to us as humans (phone, watch, etc.) then have things that are industrial (cars, factories, etc.).
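The collect, aggregate, analyze, act loop can be sketched in a few lines of Python.  This is only an illustration, not anything shown in the keynote: the function names, the temperature readings, and the 25-degree threshold are all assumptions made up for the example.

```python
from statistics import mean


def collect(sensor_readings):
    """Collect: keep only the readings that actually arrived (drop None)."""
    return [r for r in sensor_readings if r is not None]


def aggregate(readings):
    """Aggregate: reduce many readings to one number (here, the average)."""
    return mean(readings)


def analyze(average, threshold):
    """Analyze: decide whether the aggregated value needs attention."""
    return average > threshold


def act(too_hot):
    """Act: return the command an actuator would receive."""
    return "cooling_on" if too_hot else "idle"


def run_cycle(sensor_readings, threshold=25.0):
    """One pass through the loop; a real system would repeat this continuously."""
    readings = collect(sensor_readings)
    return act(analyze(aggregate(readings), threshold))
```

Calling run_cycle repeatedly with fresh readings is what makes it a continuous loop.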

Consumer IOT: Fitbit, Nest, Lumo.  What can they reveal about us?  Health info, house information, driving habits.  You can analyze that information and make predictions/revelations.  The Industrial Internet of Things (IOT) can be connected, and then significant value can be realized, particularly in industry.  It is still in its infancy.  There are four types of IOT capabilities: Monitoring, Control, Optimization, Autonomy.  There is an analogy from this to human development: we are in the “terrible twos” of IOT development.  Why IOT?  We are at the peak of the hype right now (based on Gartner).  There is a growth of “things” connected to the internet.  In 2003 there were about 500 million devices connected to the internet.  There were 12.5 billion by 2010.  Around 2008, the number of things connected to the internet exceeded the number of people.  In 2015, we are at 25 billion things connected to the internet.  The value to customers is huge.  The power of 1% – if you can improve 1% in fuel savings in an industry like aviation, health care, or power generation, that’s extremely significant.

Why is this happening now?  More mobile devices, better sensors and actuators, and BI analysis.

For IOT How?, Dr. DeWitt comes on stage.  Dr. DeWitt is going to talk about the services available.  There are a lot of challenges – a large number (and variety) of sensors.  There are A LOT of devices sending data.  Sensors are frequently dirty, and it’s hard to distinguish between dirty readings and anomalies.  And then there is just the volume of data that’s being sent into the cloud.  One of the biggest challenges is device security.  How do you prevent devices from overwhelming cloud infrastructure, or someone from impersonating a device?  And then there’s cloud-to-device messaging.  Sometimes the device is not online.  Therefore the device may miss a message, so a persistent queue and reliability are needed.  How do you deploy this and get the IOT set up?  We’re not going to tackle that today.

There are differences between consumer and industrial IOT.  In consumer IOT you have to worry about battery and power failure, devices are more cost-sensitive, the device might be a simple embedded device or a powerful sensor, and connectivity is wireless (industrial has unlimited power, is full-fledged and wired, and depends on the functionality needed).  The rest of the talk will focus on industrial.  Note: one size fits none.

Today’s IOT: Just Do It Yourself.  The state of the art is still rather primitive.  What are the ingredients that go into IOT?  In the basic block diagram, out in the field you have devices with a sensor and actuator (e.g. sense temp, humidity, in a Nest thermostat).  Up in the cloud, you have an event/data aggregator.  Device to Cloud (D2C) is how the data gets from the device up to the cloud.  You can feed this data into an application, into event/data storage, or into a real-time processing engine, and that *can* use a device controller and send it back to the device (C2D = Cloud to Device).  Azure IOT services exist.  Two main components: Azure IOT Hubs and Azure Event Hubs.  The data management is done through Azure Stream Analytics, DocumentDB, SQL Azure and SQL-DW, Azure HDInsight and Azure Machine Learning, and then you use PowerBI and Excel to visualize the data.

Azure IOT Hub (an Azure PaaS Service) is the cornerstone of IOT.  It receives events and routes them.  It is scalable to millions of devices, and it provides per-device instance authentication.  It can send commands back to the devices.  Within the hub, every device has its own send endpoint, to which the sensors will send events.  On the output side is a set of partitions, into which data gets routed.  The number of partitions is set when the service is created in the cloud.  A hash function routes each event to a partition.  Event consumers then “pull” events from the Receive EndPoint.  There is a C2D Send Endpoint that can send messages out, which then get routed to a message queue that guarantees once delivery out to the device’s actuator.
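To make the partition routing concrete, here is a minimal Python sketch of hashing a device ID to one of a fixed number of partitions.  The talk did not say which hash function the hub uses, so SHA-256 here is purely an assumption, and partition_for, route, and the event dictionaries are hypothetical names invented for the example.

```python
import hashlib


def partition_for(device_id: str, partition_count: int) -> int:
    """Map a device ID to a partition number in [0, partition_count).

    SHA-256 is an assumed stand-in for whatever hash the service really uses.
    """
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def route(events, partition_count):
    """Place each event into the partition chosen by its device ID."""
    partitions = [[] for _ in range(partition_count)]
    for event in events:
        index = partition_for(event["device_id"], partition_count)
        partitions[index].append(event)
    return partitions
```

Because the hash depends only on the device ID, every event from one device lands in the same partition, so a consumer reading that partition sees that device’s events in order.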

One thing you can do with events is pull them out of the IOT HUB and they go to the Event Consumer such as SQL Azure (doesn’t have a nicer sexy symbol like SQL Server), into HDFS, into Azure Storage, or into DocDB (these are examples).  Analyzing the events, then, can be done via SQL Server, or use SQL-DW and Polybase, Hadoop from HDFS (or Hive/Storm), or DocDB.  All of these are great opportunities to store events.  A neat thing to do with IOT data is LEARN from it (e.g. when the boiler might explode).

Options for a real-time query engine include Azure Stream Analytics or Apache Storm on HDInsight.  What’s a real-time query engine?  With a traditional RDBMS the data is on disk: you send in a query and get data back.  In Dr. DeWitt’s mind, real-time streaming is taking a sequence of events, and some queries that will operate over those events, and the query will find IDs of boilers that are about ready to explode based on PSI.  As the query processes stream events, it will eventually produce results.  You can have multiple queries operating over the same set of events, or different streams.  Dr. DeWitt encourages us to learn about stream analytics.
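A rough way to picture a streaming query is a Python generator that events flow through, with nothing stored along the way.  This is a toy sketch, not how Azure Stream Analytics or Storm works internally; run_queries, the query functions, and the PSI limits are all assumptions made up for illustration.

```python
def run_queries(events, queries):
    """Push every event through every registered query.

    Yields (query_name, result) whenever a query produces a result.
    The events themselves are never stored.
    """
    for event in events:
        for name, query in queries.items():
            result = query(event)
            if result is not None:
                yield name, result


def over_pressure(event, psi_limit=150):
    """Return the boiler ID when its pressure exceeds the (assumed) limit."""
    return event["boiler_id"] if event["psi"] > psi_limit else None


def under_pressure(event, psi_floor=20):
    """A second query over the same stream: boilers with very low pressure."""
    return event["boiler_id"] if event["psi"] < psi_floor else None
```

Passing both functions in the queries dictionary shows several queries operating over one stream of events.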

There is no data stored, the queries are just continually running, data flows through the query, outputs results.  When you see something important, what do you do?  Send a message to IOT hub to do an action (e.g. open pressure release valve).  Field gateway – Raspberry Pi, running Windows 10, has WiFi – that’s a field gateway.  There are two primary use cases: when a sensor/device cannot itself connect to the internet, or for complex objects (e.g. smart cars) with multiple sensors/actuators.  Two flavors: opaque (only field gateway has identity in IOT hub) and transparent (each device is registered in IOT hub).  Field gateways are devices with memory and processors.

How to manage IOT metadata?  Per-device metadata is not stored in a database system at the present time, so there is no query support.

Device security is super critical for IOT deployment.  Devices must have unique identities, and must PULL to obtain C2D commands (no ports open to reduce attacks).  Main takeaway: it is PUSH to the cloud.  All the IOT events get pushed up into the cloud.  It was a good first effort.  But what are the problems with pushing everything to the cloud? Not enough bandwidth, requires connectivity, latency, data deluge (from boring sensor readings), storage constraints (storing EVERY event), speed.  Main point: it wastes network bandwidth, computational resources, storage capacity and bandwidth processing for NON-INTERESTING events.
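The pull model for cloud-to-device commands can be pictured as a per-device queue that lives in the cloud and that the device polls when it is online.  The sketch below is an in-memory illustration only (a real service would persist the queue); CommandQueue and its methods are hypothetical names, not an Azure API.

```python
from collections import defaultdict, deque


class CommandQueue:
    """Holds pending commands per device until the device pulls them."""

    def __init__(self):
        # One FIFO queue per device ID, created on first use.
        self._pending = defaultdict(deque)

    def send(self, device_id, command):
        """Cloud side: queue a command, whether or not the device is online."""
        self._pending[device_id].append(command)

    def pull(self, device_id):
        """Device side: fetch the oldest pending command, or None if empty."""
        queue = self._pending[device_id]
        return queue.popleft() if queue else None
```

The device never has to accept an inbound connection; it only makes outbound pull calls, which is the point of keeping its ports closed.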

Go back to the boiler example…  Running the same query over and over wastes bandwidth, sending the reading every second.  Centralizing all data from multiple systems might overload the system.  Here is their insight: exploit the capability of the field gateway.  It can do local processing and control.  You have the boiler with sensor and actuator.  Then you have a field gateway, and in that you run a streaming database system, install a boiler control program on that gateway, and run data through.  If you run the streaming engine there, you can run any number of queries, and might send the average pressure reading for 60 seconds of data up to the IOT hub.  This is a better approach – it reduces what is pushed up to the cloud, and what needs to be stored.
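The gateway idea of sending one average per 60 seconds instead of one reading per second can be sketched as a windowed aggregation.  This is a simplified illustration under assumptions: readings arrive in time order as (timestamp_seconds, psi) pairs, windows are fixed and aligned to multiples of the window size, and gateway_averages is a made-up name.

```python
def gateway_averages(readings, window_seconds=60):
    """Yield (window_start, average_psi) once per completed window.

    readings: iterable of (timestamp_seconds, psi) pairs in time order.
    """
    window_start = None
    total = 0.0
    count = 0
    for timestamp, psi in readings:
        # Align each reading to the start of its fixed-size window.
        bucket = timestamp - (timestamp % window_seconds)
        if window_start is None:
            window_start = bucket
        if bucket != window_start:
            # The previous window is finished: emit one summary message.
            yield window_start, total / count
            window_start, total, count = bucket, 0.0, 0
        total += psi
        count += 1
    if count:
        # Flush the final (possibly partial) window.
        yield window_start, total / count
```

With one reading per second, two minutes of data is 120 readings in and only 2 messages out to the cloud.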

How can we do better?  Dr. Nehme comes back on stage…  (she has changed her outfit…but don’t tweet about it…she’s a jeans and tshirt girl (I KNEW IT))

Fog computing – all about computing on the edge.  It is not cloud vs. fog, it is cloud + fog.

What’s the fog?  It’s like “predicate pushdown”.  Never move the data to the computation, move the computation to the data.  Devices perform some data pre-processing and compression, the cloud is a big gorilla that can do the management, processing, and machine learning.  How can we do better?  Real-time response, scalability, metadata management, GeoDR of IOT hubs.  IOT is a database problem, not just a networking problem.  It hasn’t been database-centric before, but trying to address that.

Want to take existing IOT Azure services and expand on them.  Proposing Polybase for IOT (not a product announcement, just an idea).  What is the vision? Declarative language, complex object modeling, scalable metadata management, discrete and continuous queries, multi-purpose querying, computation pushdown.

Declarative language: if dealing with IOT today, the only choice is to use an imperative language.  You have to explicitly specify how you want to see something.  What about IOT-SQL?  A declarative language where you can select information from the sensors.  You have tables specified as buildings, rooms, temperature sensors, etc.  With temperature sensors, you have columns that look like a regular database.  Need to figure out how to model complex objects – for example, a room on a floor in a building – need a model for this.  Have a notion of a shell database – it is a regular database that stores metadata, statistics, and access privileges – can perform authentication, authorization and query optimization against that database.  As far as these processes are concerned, they don’t need the actual data.  Now expand this to the devices.  The IOT shell also gives a simple abstraction for sensors, actuators, and distributors.  The shell can be stored in SQL Azure, DocDB, etc.  It’s JUST a database.

What about querying devices?  One query is ExecuteOnce: push select to device, it sends results, we’re done.  ExecuteForever, push SELECT to device, then the device continually sends results back to client.  When done, send signal we’re done and query stops running.  Then have ExecuteAction: send a SELECT and then an action, and the action gets fired when predicate is met.  Can do execution once, or forever.

Back to the temperature sensor table…need some declarative queries.  ExecuteOnce – get the count of all hot locations.  The optimized plan is generated, data is moved, and then work is done up in the cloud.  Not a lot of pushdown here.  ExecuteForever query – record all hot locations up in the cloud, and execute forever, the optimizer might produce a different plan (does some partial aggregation before pushing data up into the cloud – larger computation is done in the “fog”).

ExecuteAction: turn on AC in all the hot locations.  Larger computation and the action is pushed down in the fog, and only interesting events are pushed up into the cloud.  Multi-purpose query – based on results, some could go to one location, some could go to another location.
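The three query modes can be mimicked in Python to show how they differ.  Polybase for IOT was presented as an idea rather than a product, so everything here is hypothetical: the function names simply mirror the modes described in the talk, and the readings, the “hot” predicate, and the action are invented for the example.

```python
def execute_once(readings, predicate):
    """ExecuteOnce: evaluate the SELECT one time and return the matches."""
    return [r for r in readings if predicate(r)]


def execute_forever(stream, predicate):
    """ExecuteForever: keep yielding matches until the caller stops iterating."""
    for reading in stream:
        if predicate(reading):
            yield reading


def execute_action(stream, predicate, action, once=False):
    """ExecuteAction: fire the action whenever the predicate is met.

    With once=True the query stops after the first time the action fires.
    Returns how many times the action fired.
    """
    fired = 0
    for reading in stream:
        if predicate(reading):
            action(reading)
            fired += 1
            if once:
                break
    return fired


def is_hot(reading, limit=80):
    """Assumed definition of a 'hot' location for this sketch."""
    return reading["temp"] > limit
```

In the fog version of this, execute_action would run on the gateway next to the sensors, so only the readings that triggered the action would need to travel up to the cloud.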

The Polybase for IOT Wrapup – use SQL front end with Polybase for sensor/actuator metadata management and querying.  Exploit Polybase’s external attribute mechanism to allow SQL queries to reference sensor values…and then one more thing I didn’t get 🙂

Why should we, as data professionals, care?  When a new technology rolls over you, you’re either part of the steamroller or part of the road (didn’t get the attribution).  Key takeaway: the amount of data to manage is going up exponentially.  Need to step back to see what success looks like.

Dr. Nehme has announced that this is their last keynote.  Why? Dr. DeWitt…they have done 7 of these.  There are a lot of great speakers at MS, and he is sure there are people who are better speakers.  Dr. DeWitt and Dr. Nehme are “parting ways”.  She is finishing up her MBA and moving on.  Dr. DeWitt is starting to think about retirement.  After 40 years he thinks it’s about time to give up the full time gig.  In 10 years…  We have not seen the last of Dr. Nehme – whether it’s at Microsoft or at a competitor.  Dr. DeWitt says this has been one of the brightest spots in his career.  He says it’s been a terrific experience.  He will think about this community for many years to come. (ELS: I admit, I’m a little teary.)

PASS Summit 2015: Day 1

Well friends, here I sit again, at the blogger’s table, ready to kick off Day 1 of the PASS Summit here in Seattle, Washington.  My trusty side-kick, Perry, is with me as usual, and we’re joined by Bunny this year as my 8 year old insisted I bring them both. Who am I to argue with her?

Some notes about today

It’s National Chocolate Day!  I plan to celebrate all day 🙂

I present twice today!  My first session is right after the keynote: Kicking and Screaming: Replacing Profiler with Extended Events in Room 6A from 10:15 to 11:30 AM.  Note that this session is a little later and in a different room than originally scheduled.  My second session, Statistics and Query Plans, is from 3:15 PM to 4:30 in 6B.  I hope to see you at one of my sessions, feel free to come up and say hi if we haven’t met before (or if we have!).

Today’s keynote, Accelerating Your Business With a Modern Data Strategy, is headlined by Joseph Sirosh who is a Corporate Vice President in the Data Group at Microsoft.  And we’re off…

8:21 AM

Up first today is PASS President Tom LaRock.  This is Tom’s last year as President, he’ll next step into the role of Immediate Past President.  He mentions #SQLFamily and says that everyone is free to give him a hug.  That could be a lot of hugs.

Attendees from over 58 countries…over 2000 companies are represented, and the Microsoft team will be everywhere this week – stop by the SQLClinic if you have any questions you need help with.

For those of you not here, please follow along on PASS TV, just head over to the main page for Live Streaming.

Tom introduced the PASS Board of Directors and encouraged members of the community to talk to the board this week to help them understand how to serve the community better.  Tom mentions the Board Q&A on Thursday at 3:30 PM in 307-308.

There are 5,500 total registrations this year for Summit (note: that’s not individuals…if you register for a pre-con and the conference, I think that’s 2 registrations, not 1).  Tom asks for a show of hands from newcomers…there are a lot. ELS: Those of you who are here for the first time, try to meet people!  If you’re an introvert, I know that’s hard, but take a risk!  Say hi, find something in common!

The SQL community is the gold standard for technical communities. ELS: I don’t disagree, I have friends in other technical disciplines, and they have nothing like what we have.

There are over 200 sessions and workshops this week.  Use the mobile app Guidebook to stay on top of any schedule changes.  On Twitter follow along with hashtags #sqlpass and #summit15.

The Birds of a Feather lunch will take place during lunch on Friday where you can talk to people with an interest in a specific feature/area.

Don’t forget our Sponsors who make this entire event possible.  There are some fantastic companies that support the SQL Server community.  PLEASE make time to go talk to them this week.  The Exhibitor Reception is tonight, after regular sessions end.

Tom closes by saying how proud he is to be a member of the #SQLFamily community.  He’s been a member since 2004.  I think he’s getting a little choked up.  Oh.  HUGS TOM!

8:37 AM

Joseph Sirosh takes the stage.

We live in an age of data.  The ability to extract that data and use it is changing our daily lives.  All of the world’s data was analog 30+ years ago.  Then we got DVDs and such which started to digitize data.  When the internet came along, data suddenly had an IP address.  Connected data can be moved around and joined with other connected data.  Which means you can extract intelligence from it.  The vast majority of today’s data is digital.  Much of that is in the cloud.  Fast forward to 2020, there will be 50 million petabytes of data, mostly in the cloud.  Fifty years ago, hardware drove new customer experiences.  Then came the age of software.  Digitizing everything.

In the new world, data will predict everything.  We can use this data to develop models so that when, for example, people come in to the ER with a problem, you can put in data collected and use a model to determine a path of care.

Joseph brings up Eric Fleischman, who is the Chief Architect and VP of Platform Engineering at DocuSign (we use their site!).  They chose to use SQL Server because they believed Microsoft would be there for them (and they have been).  They made an investment into the telemetry of the system that processes millions of data points about the performance of the actual system.  That system is scaling literally to the OLTP system.  There are some improvements in the HA/DR stack for them in 2016, along with the encrypted features.

SQL Server 2016 is meant to be the all engines of data that you can build your data on – both in house and in the cloud.  Innovate first in the cloud with an accelerated speed (push new code once a week).  The pain in this system translates into changes in software very quickly.  When you build and operate in the cloud, you take innovation and bring it back to packaged software (SQL Server 2016).  Companies like Oracle cannot claim that…who build locally and then ship to the cloud.  Amazon will state they are only in the cloud, it’s a cloud-only feature. But “we” know better.  Feet on the ground and head in the cloud.  You have to build products to operate both in the sky and the ground.  Microsoft is the only company to do that.  This community is making Microsoft number 1 in the age of data.

HUGE STATEMENT from Joseph.

There’s a video with some feedback from fellow MVPs about SQL Server 2016…  Joseph turns the stage over to Shawn Bice (General Manager, Database Systems Group).  Haven’t shipped 2016 yet, but it powers everything in the cloud.

Seven big bets…all of these are built-in.

From OLTP perspective – SQL Server is recognized as a leader.

SQL Server is the most secure database.  SQL runs some of the most scalable data warehouses in the world.  Mobile BI is built into SQL Server, it’s about that mobile workforce and getting visualizations to them.

First big bet: HA/DR.  Have learned a lot from partnership with DocuSign.  DocuSign is using some of the fastest IO subsystems with FusionIO.  Have A LOT of data moving across the wire to secondaries.  Have done a ton of work with algorithms to improve updates.  Have customers that use Azure along with on premise all the time.  You can enroll an Azure DB with an on-prem system to create a DR site.  For all of you that have used DB Mirroring: want to use an AG but can’t domain join it.  In SQL 16, can stand up HA environment, don’t have to domain join anything.  Woohoo!  Introduce load balancing around read scale, so don’t have to point clients to every secondary.  The stack for on prem is the same that’s in Azure, and they do failovers every day.  They had a data center that was on fire and failed over every customer in China in about 5 hours.

Ok, I need to get to my first session, I’ll be back tomorrow!  Have a great day!