The drug discovery industry is facing a major systematic failure in discovering novel drugs. Although some of this lack of productivity may be attributed to the low-hanging fruit having already been harvested, leaving more challenging diseases and interventions to develop, other factors are at play. A key recent change in technology has been essentially free, ready access to low-cost computing and data; however, it is clear that this alone is not enough, and smart integration, hypothesis generation, and in silico validation across these data will be essential before any impact on progress is realised. In this presentation I discuss the building of a large public drug discovery database connecting compounds through to pharmacological effects and molecular targets, then provide an overview and comparison of successful and unsuccessful drug discovery and development programs, leading to insights into the target systems most likely to succeed in future. This view is then integrated into the framework of large-scale personalised medicine. Finally, the issue of scientific reproducibility and its impact on data analysis is discussed.