Tuesday, December 2, 2014
Guggenheim 101 (Lees-Kubota Lecture Hall) – Guggenheim Aeronautical Laboratory

PUSD: Annual Open Enrollment

New Center Supports Data-Driven Research

With the advanced capabilities of today's computer technologies, researchers can now collect vast amounts of information with unprecedented speed. However, gathering information is only one half of a scientific discovery, as the data also need to be analyzed and interpreted. A new center on campus aims to hasten such data-driven discoveries by making expertise and advanced computational tools available to Caltech researchers in many disciplines within the sciences and the humanities.

The new Center for Data-Driven Discovery (CD3), which became operational this fall, is a hub for researchers to apply advanced data exploration and analysis tools to their work in fields such as biology, environmental science, physics, astronomy, chemistry, engineering, and the humanities.

The Caltech center will also complement the resources available at JPL's Center for Data Science and Technology, says director of CD3 and professor of astronomy George Djorgovski.

"Bringing together the research, technical expertise, and respective disciplines of the two centers to form this joint initiative creates a wonderful synergy that will allow us opportunities to explore and innovate new capabilities in data-driven science for many of our sponsors," adds Daniel Crichton, director of the Center for Data Science and Technology at JPL.

At the core of the Caltech center are staff members who specialize in both computational methodology and various domains of science, such as biology, chemistry, and physics. Faculty-led research groups from each of Caltech's six divisions and JPL will be able to collaborate with center staff to find new ways to get the most from their research data. Resources at CD3 will range from data storage and cataloguing that meet the highest "housekeeping" standards, to custom data-analysis methods that combine statistics with machine learning—the development of algorithms that can "learn" from data. The staff will also help develop new research projects that could benefit from large amounts of existing data.

"The volume, quality, and complexity of data are growing such that the tools that we used to use—on our desktops or even on serious computing machines—10 years ago are no longer adequate. These are not problems that can be solved by just buying a bigger computer or better software; we need to actually invent new methods that allow us to make discoveries from these data sets," says Djorgovski.

Rather than turning to off-the-shelf data-analysis methods, Caltech researchers can now collaborate with CD3 staff to develop new customized computational methods and tools that are specialized for their unique goals. For example, astronomers like Djorgovski can use data-driven computing in the development of new ways to quickly scan large digital sky surveys for rare or interesting targets, such as distant quasars or new kinds of supernova explosions—targets that can be examined more closely with telescopes, such as those at the W. M. Keck Observatory, he says.

Mary Kennedy, the Allen and Lenabelle Davis Professor of Biology and a coleader of CD3, says that the center will serve as a bridge between the laboratory-science and computer-science communities at Caltech. In addition to matching up Caltech faculty members with the expertise they will need to analyze their data, the center will also minimize the gap between those communities by providing educational opportunities for undergraduate and graduate students.

"Scientific development has moved so quickly that the education of most experimental scientists has not included the techniques one needs to synthesize or mine large data sets efficiently," Kennedy says. "Another way to say this is that 'domain' sciences—biology, engineering, astronomy, geology, chemistry, sociology, etc.—have developed in isolation from theoretical computer science and mathematics aimed at analysis of high-dimensional data. The goal of the new center is to provide a link between the two."

Work in Kennedy's laboratory focuses on understanding what takes place at the molecular level in the brain when neuronal synapses are altered to store information during learning. She says that methods and tools developed at the new center will assist her group in creating computer simulations that can help them understand how synapses are regulated by enzymes during learning.

"The ability to simulate molecular mechanisms in detail and then test predictions of the simulations with experiments will revolutionize our understanding of highly interconnected control mechanisms in cells," she says. "To some, this seems like science fiction, but it won't stay fictional for long. Caltech needs to lead in these endeavors."

Assistant Professor of Biology Mitchell Guttman says that the center will also be an asset to groups like his that are trying to make sense out of big sets of genomic data. "Biology is becoming a big-data science—genome sequences are available at an unprecedented pace. Whereas it took more than $1 billion to sequence the first genome, it now costs less than $1,000," he says. "Making sense of all this data is a challenge, but it is the future of biomedical research."

In his own work, Guttman studies the genetic code of lncRNAs, a new class of gene that he discovered, largely through computational methods like those available at the new center. "I am excited about the new CD3 center because it represents an opportunity to leverage the best ideas and approaches across disciplines to solve a major challenge in our own research," he says.

But the most valuable findings from the center could be those that stem not from a single project, but from the multidisciplinary collaborations that CD3 will enable, Djorgovski says. "To me, the most interesting outcome is to have successful methodology transfers between different fields—for example, to see if a solution developed in astronomy can be used in biology," he says.

In fact, one such crossover method has already been identified, says Matthew Graham, a computational scientist at the center. "One of the challenges in data-rich science is dealing with very heterogeneous data—data of different types from different instruments," says Graham. "Using the experience and the methods we developed in astronomy for the Virtual Observatory, I worked with biologists to develop a smart data-management system for a collection of expression and gene-integration data for genetic lines in zebrafish. We are now starting a project along similar methodology transfer lines with Professor Barbara Wold's group on RNA genomics."

And, through the discovery of more tools and methods like these, "the center could really develop new projects that bridge the boundaries between different traditional fields through new collaborations," Djorgovski says.


Using Simulation and Optimization to Cut Wait Times for Voters

No one ever likes long lines. Waiting in line may be inconvenient at the coffee shop or the bank, but it's a serious matter at voting centers, where a long wait time can discourage voters—and can be seen as an impediment to democracy.

However, with millions of Americans showing up at the polls, can long lines really be avoided on Election Day? By developing a tool to help better prepare polling places, Caltech sophomore Sean McKenna is using his Summer Undergraduate Research Fellowship (SURF) project as an opportunity to address that problem.

Over the summer, McKenna, an applied and computational mathematics major who works with Professor of Political Science Michael Alvarez, has been building a mathematics-informed tool that will predict busy times in precincts on Election Day and allocate voting machines in response to those predictions. This information could help election administrators minimize wait times for millions of voters.

"My project is based on a report from the Presidential Commission on Election Administration, which asserted that no American should ever have to wait more than 30 minutes to vote," McKenna says. "And so we're trying to see if we can help reach that goal by allocating voting machines in a new way."

McKenna's work is part of the Caltech/MIT Voting Technology Project (VTP), which has been working on voting technology and election administration since the 2000 election. At a June workshop for the collaborative VTP project, which aims to improve the voting process through research, McKenna met with academics and election administrators who suggested how he might apply his background in mathematics to create a tool for voting administrators to use on the VTP's website.

The tool he is developing uses a branch of applied mathematics called queueing theory to quantify the formation of lines on Election Day. "Queueing theory assumes that arrivals to a system like a polling place have a random, memoryless pattern. Under this assumption, the fact that one person just showed up to the precinct doesn't tell us whether the next person will show up two seconds from now or two minutes from now," he says. "Furthermore, queueing theory predicts line lengths and wait times as long-term averages, which scientists might call a steady-state approximation."
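As a rough illustration of the steady-state prediction McKenna describes, the sketch below uses the standard Erlang C formula for a queue with memoryless arrivals and several identical servers. It is not taken from McKenna's tool: the function names, the arrival rate, the time per voter, and the assumption that voting times are also memoryless are all hypothetical choices made for the example.

```python
import math


def erlang_c(num_machines: int, offered_load: float) -> float:
    """Probability that an arriving voter must wait (Erlang C formula).

    offered_load = arrival rate * mean time per voter, i.e. the average
    number of machines that would be busy if nobody ever had to queue.
    """
    c, a = num_machines, offered_load
    if a >= c:
        raise ValueError("Line grows without bound: load >= number of machines")
    top = (a ** c / math.factorial(c)) * (c / (c - a))
    bottom = sum(a ** k / math.factorial(k) for k in range(c)) + top
    return top / bottom


def mean_wait_minutes(arrivals_per_min: float,
                      minutes_per_voter: float,
                      num_machines: int) -> float:
    """Long-run average time in line (excluding voting itself), in minutes."""
    offered_load = arrivals_per_min * minutes_per_voter
    service_rate = 1.0 / minutes_per_voter  # voters per minute per machine
    spare_capacity = num_machines * service_rate - arrivals_per_min
    return erlang_c(num_machines, offered_load) / spare_capacity


if __name__ == "__main__":
    # Hypothetical precinct: 1.5 voters arrive per minute, 5 minutes each.
    for machines in (8, 9, 10, 12):
        print(machines, round(mean_wait_minutes(1.5, 5.0, machines), 2))
```

The result is a long-term average, so it describes a polling place that has been running at a constant arrival rate for a long time.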

Although queueing theory provided a good jumping-off point, there were a few real-world problems that an analytical model on its own couldn't address, McKenna says. For example, voter arrival behavior is not completely random on Election Day; early morning and late afternoon spikes in arrivals are the norm. In addition, polls are usually open for only 12 or 13 hours, which is generally not considered long enough for steady-state queueing approximations to apply.
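Both complications, arrival spikes and a polling day of limited length, are straightforward to represent in a simulation. The sketch below is a minimal single-precinct example and not the VTP tool itself: the hourly arrival rates, the five-minute average voting time, the exponentially distributed voting times, and the function name simulate_precinct are all assumptions made for illustration.

```python
import heapq
import random


def simulate_precinct(hourly_rates, mean_service_min, num_machines, seed=0):
    """Simulate one polling day and return each voter's wait in minutes.

    hourly_rates: expected arrivals in each hour the polls are open.
    Arrivals follow a time-varying Poisson process (generated by thinning);
    voters are served first come, first served on identical machines.
    """
    rng = random.Random(seed)
    peak_per_min = max(hourly_rates) / 60.0
    closing_time = 60.0 * len(hourly_rates)

    # Draw candidate arrivals at the peak rate, then keep each one with
    # probability (current rate / peak rate).
    voters = []
    now = 0.0
    while peak_per_min > 0:
        now += rng.expovariate(peak_per_min)
        if now >= closing_time:
            break
        rate_now = hourly_rates[int(now // 60)] / 60.0
        if rng.random() < rate_now / peak_per_min:
            service = rng.expovariate(1.0 / mean_service_min)
            voters.append((now, service))

    # Min-heap of the times at which each machine next becomes free.
    free_at = [0.0] * num_machines
    heapq.heapify(free_at)
    waits = []
    for arrival, service in voters:
        machine_free = heapq.heappop(free_at)
        start = max(arrival, machine_free)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits


if __name__ == "__main__":
    # Hypothetical 13-hour day with morning and late-afternoon spikes.
    rates = [120, 90, 50, 40, 40, 50, 40, 40, 50, 70, 110, 120, 60]
    for machines in (6, 8, 10):
        waits = simulate_precinct(rates, 5.0, machines, seed=1)
        print(machines, "machines: longest wait",
              round(max(waits), 1), "min")
```

Because the voters are generated before any machine is assigned, the same seed produces the same arrivals for every machine count, which makes the comparison between allocations fair.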

"These challenges led us to review the literature and determine that running a simulation with actual data from administrators, as opposed to attempting to adjust strictly analytical models, was the best way to represent what actually happens in an election," McKenna explained.

The goal of the research is to create a simulation of an entire jurisdiction, such as a county with multiple polling places. The simulation would estimate wait times on Election Day based on information election administrators enter about their jurisdiction into the web-based tool. Administrators would then receive a customized output prior to Election Day, suggesting how to allocate voting machines across the jurisdiction and detailing the anticipated crowds—information that could both predict the severity of long lines and prompt new strategies for allocating voting machines to preempt long waits.
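The article does not say how the tool decides where machines should go. As one simple possibility, the sketch below hands out machines one at a time to whichever precinct currently has the most expected voters per machine. The greedy rule, the precinct names, the turnout figures, and the function name allocate_machines are all hypothetical; a real tool would more likely score each allocation by simulated wait times.

```python
import heapq


def allocate_machines(expected_voters, total_machines):
    """Greedily split machines across precincts by voters per machine.

    expected_voters: dict mapping precinct name to expected turnout.
    Every precinct gets at least one machine; each remaining machine goes
    to the precinct with the highest current voters-per-machine ratio.
    """
    if total_machines < len(expected_voters):
        raise ValueError("Need at least one machine per precinct")

    allocation = {precinct: 1 for precinct in expected_voters}
    # heapq is a min-heap, so store negated loads to pop the largest first.
    heap = [(-float(voters), precinct)
            for precinct, voters in expected_voters.items()]
    heapq.heapify(heap)

    for _ in range(total_machines - len(expected_voters)):
        _, precinct = heapq.heappop(heap)
        allocation[precinct] += 1
        load = expected_voters[precinct] / allocation[precinct]
        heapq.heappush(heap, (-load, precinct))
    return allocation


if __name__ == "__main__":
    turnout = {"North": 1800, "Central": 950, "South": 400}
    print(allocate_machines(turnout, 12))
```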

Several other Caltech undergraduates in Alvarez's group also have been working on alternative ways to improve the voting process. Senior physics major Jacob Shenker has been developing a system for more secure and user-friendly postal voting, and recent graduates Eugene Vinitsky (BS '14, physics) and Jonathan Schor (BS '14, biology and chemistry) produced a prototype of a mobile phone app that could help voters determine if there is a long line at their polling place.

While these projects were completed separately, McKenna says there may be room for collaboration in the future. "One thing that we're hoping my tool will be able to do is to predict for administrators what times are going to be busiest, and we could also export this information to the app for voters," he says. "For example, the app could alert someone that their polling place is very likely to have long lines in the morning so they should try to go in the afternoon."

The technologies that McKenna and his student colleagues are developing could change the way that millions of Americans participate in democracy in the future—which would be an impressive accomplishment for a young student who has yet to experience the physical aspect of lining up to vote.

"So that's one kind of sticky situation about my working on this project: I've never actually been in to vote in person. I've only been able to vote once, and since I'm from Minnesota, it had to be absentee by mail," he says.

Wednesday, October 29, 2014
Center for Student Services 360 (Workshop Space) – Center for Student Services

Meet the Outreach Guys: James & Julius

Wednesday, October 29, 2014
Avery Courtyard – Avery House

Fall Family Festival

Friday, October 17, 2014
Center for Student Services 360 (Workshop Space) – Center for Student Services

TA Training: fall make-up session

The Risk and Reward of Venture Capital: An Interview with Michael Ewens

Michael J. Ewens recently joined the faculty at Caltech as associate professor of finance and entrepreneurship after four years at the Tepper School of Business at Carnegie Mellon University. A native of Wisconsin, Ewens attended Washington University in St. Louis, majoring in mathematics and economics before moving on to UC San Diego for graduate studies in economics.

Ewens explains how he discovered venture capital through a summer job in graduate school, and shares his ambitions for his future at Caltech.


What field do you specialize in?

Entrepreneurial finance. I study the financing and development of high-growth start-ups such as Twitter, biotech start-ups, or new clean-energy firms. I study how money and investors get matched to start-ups and what value is created after they are financed. What are the factors that lead to them receiving the right money at the right prices, or failing to? How is capital raised?

Entrepreneurship is a fascinating area because it is at the extreme of many problems that come up in economics. A classic issue in economics is what happens in a situation where one person knows a lot more than the other—information asymmetry—and can take advantage. This is often the case in entrepreneurship, where you have people who are new to the business world seeking venture capital from people who have expertise in finance and money.


Is your interest in what goes into making start-ups successful purely theoretical?

No, it's a very important issue in practical terms. For example, Caltech allocates a part of its endowment toward the private equity asset class, which includes venture capital. So understanding how investments in start-ups behave in terms of risk and return is fundamentally important.

And, of course, it's important for entrepreneurs and policy makers. Most government officials think it's good to have more start-ups, and they think they know how to set policy to lead to more start-ups. But every economist who studies entrepreneurship comes from the position that we really don't know how to encourage start-ups and make them more profitable. Take the example of health care. It is thought that one reason people don't leave large companies to start new ones is that they are locked into their health insurance plans. Now, with the introduction of the Affordable Care Act ("Obamacare"), we can begin to look at the data and see if this supposition is correct.


How did you get interested in venture capital?

It was happenstance. I was a graduate student in economics at UC San Diego, studying international trade. I wanted to live close to campus, near the beach, but not in graduate housing. To do that, I needed to earn more than my research-assistant salary. So I started consulting for a venture-capital firm called Correlation Ventures. They are a unique firm. They introduced a different kind of econometrics into venture capital, the sort of techniques used in Moneyball, which revolutionized the business of creating a winning baseball team. I fell in love with the idea.

I had initially planned to work for them just over the summer, but they offered me access to a wonderful set of data that I could use in my graduate studies, so I stayed on as their "data guy." I continue to work as a part-time advisor to the fund.

What inspires you to choose particular topics in venture capital for further research?

Venture capital is a very dynamic field, so new research topics are not hard to come by. The challenge is collecting rich data and using quality empirical strategies. For example, changes on the legislative side over the last couple of years have provided unique research opportunities. The JOBS Act [Jumpstart Our Business Startups Act], passed by Congress in 2012, significantly alters the way start-ups are financed, who can invest in them, and how such firms can eventually go public. These policy changes provide what economists call natural experiments. For example, the legislative changes make it possible for us to test theories concerning the types and magnitudes of financing frictions facing start-ups.

The underlying assumption behind such policies is that having many new small businesses is great, because, as everyone says, they create the most jobs. But what people forget is that new small businesses also destroy the most jobs, because most small businesses fail. So that's part of my research: to shine a light on what makes start-ups succeed or fail.


Are there other important issues for venture capital that you study besides changes in legislation?

Yes. For example, I'm working on a paper now with some coauthors that investigates the impact of new cloud software that has grown rapidly in use since 2005. Think the Amazon cloud. This software has made it possible for individuals to start certain types of businesses with very little money: information technology businesses, say, but obviously not something like developing new drugs, for which you need laboratory space. Then we can ask how this changes the venture capital investment choice. For example, if an investor can give you ten thousand dollars rather than a million to get your company up and running, how does that affect the investor's selection of entrepreneurs and the fate of start-ups generally?

It's also becoming easier and easier to collect disparate sources of economic data from the web. So questions that economists have studied in the past using small datasets can now be checked against much larger datasets of hundreds of thousands of observations.


What will you be teaching at Caltech?

Next January I'm going to teach a graduate course in applied econometrics, and in the spring I will be teaching a class in venture capital finance [BEM 110] that mirrors a class I taught to MBA students at Carnegie Mellon. I'm not worried about the undergrads at Caltech handling the course though. In fact, I'm looking forward to being able to throw more mathematics into the course. This course will give students background on how investors and entrepreneurs behave through the lens of economics and finance.


What attracted you to Caltech?

I liked my time at Carnegie Mellon, because in a business school you have a very close connection to industry and the "real world." But Caltech is "research first" in a way that a business school cannot be. Writing as many quality papers as possible and teaching the kind of things I was taught as a PhD student is what I'm best suited for, I think, and Caltech is the perfect place for that. In 15 years, I want to look back and say that I took on some risk and made a small but significant impact, changing the way people think about economics. Caltech shares that interest.

Tuesday, October 7, 2014
Red Door Cafe – Winnett Student Center

Samba and Salsa Exhibition

Tuesday, October 7, 2014
Center for Student Services 360 (Workshop Space) – Center for Student Services

Thirty Meter Telescope Groundbreaking and Blessing

Finessing Finance: An Interview with Richard Roll

"Everything has a price," the saying goes, and though that might sound cynical, taking the adage seriously can lead to a lifetime of fascinating inquiry. Just ask Richard W. Roll, who recently joined Caltech as the Linde Institute Professor of Finance. After spending nearly 40 years at the UCLA Anderson School of Management exploring everything from interest rates to asset portfolios, Roll has moved slightly eastward to help Caltech expand its offerings in finance.

In addition to his role as professor, Roll has consulted widely over the course of his career for agencies and corporations that are commonplace in the American lexicon: AT&T, Freddie Mac, J.P. Morgan, AARP, and the Securities and Exchange Commission, among others. Roll recently spoke with us about his research, his career, and his new position at Caltech.


What excites you about finance?

It's just enjoyable work, and very practical too. For example, mergers and acquisitions, a topic of a recent paper of mine, is a gigantic business. About 15 percent, by market value, of firms in this country are involved in mergers every year. If you can make these deals more profitable and fair for everyone involved, you've had a real impact.

I wrote a paper back in the 1980s titled "The Hubris Hypothesis of Corporate Takeovers." It was a theoretical paper arguing that when companies bid in an auction to acquire another firm, they almost always pay more than anyone else is willing to, and they often pay a penalty for that. It's known as "the winner's curse." If you win the auction, you bid more than anyone else would bid. As a result, the price of your stock often drops, because the market is saying you probably made a mistake.
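The winner's curse can be reproduced with a small Monte Carlo experiment. The sketch below is only an illustration of the idea, not the model in Roll's paper: it assumes the target firm has one true value, that each bidder sees that value plus independent normally distributed noise, and that each bidder naively bids its own estimate. The function name and every number in it are hypothetical.

```python
import random


def average_overpayment(num_bidders, true_value=100.0, noise_sd=10.0,
                        trials=10_000, seed=0):
    """Average amount by which the winning bid exceeds the true value.

    Each bidder's estimate is unbiased on its own, but the winner is
    whoever drew the most optimistic estimate, so the winning bid is
    biased upward whenever there is more than one bidder.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        bids = [rng.gauss(true_value, noise_sd) for _ in range(num_bidders)]
        total += max(bids) - true_value
    return total / trials


if __name__ == "__main__":
    for bidders in (1, 2, 5, 10):
        print(bidders, "bidders: average overpayment",
              round(average_overpayment(bidders), 2))
```

With a single bidder the average overpayment is close to zero, and it grows as more bidders compete, even though no individual estimate is biased.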

In this recent paper ["The Hubris Hypothesis: Empirical Evidence"] we were able to test this hypothesis—to actually see how much overbidding goes on in mergers and acquisitions by looking at a large dataset of 4,299 mergers from 1986 to 2008. We found that indeed there is a lot of overbidding. It's pretty prevalent.


Is mergers and acquisitions your major research focus?

No, no. There are two main areas of finance: corporate finance, which is what corporations do, and capital markets—the stock market, the bond market, things like that. I do work in both. Another of my recent papers ["Resolving the Errors-in-Variables Bias in Risk Premium Estimation"] gives a new angle on old questions, proposing a technique for making more accurate predictions about stock market behavior.


What will you be teaching at Caltech?

This fall I'm teaching the beginning finance course, BEM 103. Seventy students have registered for the course, most of them undergrads.


Do you have any sense of how teaching here might compare with your experience at UCLA?

At UCLA there are no undergrads in the business school—all the students are MBA or PhD students. But I hear that the students here are very smart, and I'm looking forward to working with them.

I imagine that most of my students at Caltech will be in engineering or physics or something like that, but they might want to change to economics at some point, or they might just want to learn something about the business side of the world as they begin their scientific careers. A lot of Caltech faculty members are also interested in financial matters. Some have patents and want to start companies, but they don't know anything about finance. I've talked to a few already.


Did you always know you wanted to study finance?

No, initially I was an engineer. I studied aeronautical engineering at Auburn University and then went to work for Boeing. It was only when Boeing decided to prepare me for engineering management and sent me to the University of Washington to get an MBA that I learned anything about business. I found I liked it better than engineering, so I quit my job with Boeing and went to the University of Chicago to get a PhD in finance, statistics, and economics.


Do you have any collaborations planned at Caltech?

Michael Ewens also just started here, and we might do something together. In fact, he and I are planning to run a finance seminar series. We have already invited Steve Ross, a Caltech alum [BS '65], professor of financial economics at MIT, and a member of the Caltech board of trustees. Ross, who studied physics at Caltech, isn't the only Caltech alum who has gone into economics. Nobel Prize winner Bob Merton [MS '67] is also an economist on the faculty at MIT.


