machine learning – Horizon Centre for Doctoral Training Blog

22 May 2024

My Placement Experience: Lessons and Triumphs

post by Kuzi Makokoro (2022 cohort)

Reflecting upon my placement, a key lesson is that the most important decision to make before starting one is which specific skills and experiences you hope to gain. This past summer, I had the opportunity to partake in a placement with my industry partner, Co-op, which turned out to be a remarkable and invaluable experience for my professional and academic growth.

Before finalising the arrangements for the placement, including setting the dates, duration, and defining the project, a series of discussions took place between my supervisors and me. We assessed the multitude of opportunities that this placement could offer. It was during these deliberations that the versatility of a placement’s benefits became apparent to me. One option is to align the placement activities with your ongoing PhD research, ensuring that the work is not only relevant to your academic pursuits but also meets the strategic needs of the industry partner. This synergy often results in a mutually beneficial outcome that propels your research forward. Another approach could be to take a break from academic work to gain a breadth of experience in the industry, thereby expanding your professional network and engaging in projects that are also of interest to you.

Having spent the last nine years in commercial roles within various industries and capacities, I was already familiar with the dynamics of industry life. This pre-existing industry experience informed my decision to select a project that complemented my PhD research. Once I made this strategic choice, the focus shifted to pinpointing a suitable project. After numerous consultations, we collectively decided to concentrate on the Healthy Start Scheme, a government initiative designed to aid low-income families with children under four by providing them with essential foods like milk, fruits, and vegetables. This project was not only crucial to my industry partner but also resonated personally with me, as it underscored the meaningful impact of data-driven initiatives on societal well-being.

The research objectives for the placement were ambitious: to utilise predictive analytics to predict the uptake percentage of the Healthy Start Scheme using food insecurity measures and to apply machine learning techniques to identify and understand the factors that influence uptake significantly. Working in conjunction with an industry partner meant that the practical application of my research findings could potentially aid the partner in supporting and promoting the scheme more effectively.

Entering the placement, I had certain preconceptions about how the experience would unfold, the nature of the work I would engage in, and the interactions I would have with various stakeholders. However, the practical aspects of my placement differed from my initial expectations. I quickly realised that my chosen topic necessitated a more independent working style, with periodic contributions from my industry partner rather than continuous collaboration. This shift led me to a new understanding of the role of a researcher in a consultative capacity, working in partnership with an industry entity. The experience also allowed me to lead a research project autonomously and understand the nuances of impact work. My responsibilities included initiating regular meetings with my contact at Co-op, seeking input and assistance from the wider team when needed, and managing the project’s pace and milestones.

In hindsight, although the timing of the placement originally seemed appropriate, I later reflected on whether doing it later on in my PhD program might have allowed for a richer output. The project demanded proficiency in skills that I had not yet mastered at the time, necessitating a steep and rapid learning process. This included developing an understanding of predictive analytics methodologies, acquiring proficiency in programming languages such as Python, learning about digital data collection techniques, and interpreting complex model results.

Consequently, what was initially set out to be a three-month placement evolved into a five-month project, as additional time was required for me to learn, adapt, and then effectively engage in the research. I adopted various learning strategies, such as the accelerated learning techniques outlined in Jake Knapp’s book “Sprint: How to Solve Big Problems and Test New Ideas in Just Five Days,” which aided me in assimilating new information rapidly, trialling different approaches, and breaking down the project into smaller, more manageable tasks. Ultimately, I was able to enhance my skill set and produce actionable insights from the project, though a better approach to defining deliverables within the given timeframe would have been advantageous.

The research outcome was insightful; we identified several strong predictors within the model, such as income deprivation and language proficiency, as well as intriguing variables like household spending on fish and the caloric density of purchases. We explored various ways in which my industry partner could leverage these insights to better support the Healthy Start Scheme in communities where it is most needed.

In summary, the placement was a journey of adapting to a different work environment, setting pragmatic goals, and scaling up professionally. This learning experience has been instrumental in advancing my PhD work. It reiterates my initial emphasis on the importance of understanding what you seek to achieve from a placement. Although I had not initially set out to acquire these specific skills through the project, they have proven to be of great value as I continue with my PhD journey. Looking ahead, I am excited about the prospect of converting this project into my first published academic paper.

28 February 2024 / 5 November 2024

Insights from the Oxford Machine Learning Summer School

post by Gift Odoh (2022 cohort)

Between the 13th and 16th of July, 2023, I attended the Oxford Machine Learning Summer School for health applications, held at the Mathematical Institute of the University of Oxford. The course, organised by AI for Global Goals in partnership with the University of Oxford’s Deep Medicine and CIFAR, focused on advanced areas in machine learning (ML), ranging from statistical and probabilistic approaches to representation learning (an ML approach based on representations of data that make it easier to extract useful information when building classifiers or other predictors [1]), specialised techniques for complex data structures, computer vision, knowledge representation and reasoning, and the integration of symbolic and neural approaches for enhanced AI capabilities.

I was drawn to the school by the opportunity to explore ML’s diverse applications and by the expectation of uncovering connections between ML techniques and my PhD research, which focuses on robotic teleoperation and human-robot interaction, particularly mental workload indicators and how they can inform robotic assistance schemes in teleoperation. I also saw it as an opportunity to meet people with similar interests in the field and to visit the renowned city of Oxford and its University of Oxford colleges, some known to have rich histories.

The first couple of sessions focused on how we understand our environment as humans – covering how we represent the world and its actual truth from different observations. These sessions paved the way for representation learning and how intelligent systems can extract useful information from features present in data, particularly when there are no labels. S. M. Ali Eslami, in his session on representation learning without labels, underscored the importance of labels to effective machine learning but demonstrated how learning can still be achieved when collecting labels is impossible. He showed how different encoders form representations (an understanding) of data from inputs, and how this is reversed through generative models that make real-world estimates from these representations. While most sessions focused on probabilistic models based on generative techniques and causal machine learning, which focus on the learning process, Professor Pietro Lio from Cambridge presented an intriguing session on graph representation learning. This is a form of machine learning suited to data organised as networks or graphs, where points of data (nodes) are connected by edges (relationships), and he made an interesting case for utilising graphs, as they are everywhere in research. Although most of the application areas were in molecule generation for proteins and drugs, its ability to extract meaningful insights and predictions from relational data could be applied to model robotic assistance schemes that respond to mental workload within a complex framework, where nodes represent operators’ mental states such as attention levels, stress levels or task demands, and the edges signify the relationships and dependencies between them.
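To make the node-and-edge idea concrete, here is a minimal sketch of a graph in plain Python, with one round of neighbourhood averaging. The node names, feature values and edges are invented purely for illustration. Real graph representation learning would use learned weights rather than a simple mean, so this shows only the basic aggregation idea, not a method presented at the school.

```python
from statistics import mean

# Hypothetical node features: one illustrative score in [0, 1] per node.
# Names and values are invented for this sketch, not measured data.
features = {
    "attention": 0.8,
    "stress": 0.4,
    "task_demand": 0.6,
}

# Undirected edges: each pair marks an assumed dependency between two nodes.
edges = [
    ("attention", "stress"),
    ("stress", "task_demand"),
]


def build_adjacency(nodes, edges):
    """Map every node to the list of nodes it shares an edge with."""
    adjacency = {node: [] for node in nodes}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def aggregate_once(features, adjacency):
    """One aggregation round: each node averages its own value with its neighbours'."""
    return {
        node: mean([features[node]] + [features[n] for n in adjacency[node]])
        for node in features
    }


adjacency = build_adjacency(features, edges)
updated = aggregate_once(features, adjacency)

for node, value in updated.items():
    print(f"{node}: {value:.2f}")
```

After one round, each node's value reflects its immediate neighbours; repeating the step would let information travel further along the edges.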

Another key aspect of the course was computer vision for ML. Some of these sessions were on the evolution of computer vision and its techniques and unsupervised visual learning for ML applications, particularly medical imaging. Understanding the progress of computer vision and where it stands today has practical implications for my work, given that vision is integral to teleoperation interaction. Christian Rupprecht presented the stages for understanding a scene, including scene classification, where the general scene is described; object detection, in which various objects in the scene are identified; segmentation, which involves dividing the scene into meaningful, distinct parts and regions; scene graph, which describes the positional relationship between objects; description, in which an improved interpretation of the scene is obtained; and hierarchy, which informs how scenes are decomposed into objects, parts and materials.

It was, however, useful that the summer school was not just about machine learning techniques in isolation. The segment on Bridging Machine Learning and Collaborative Action Research emphasised the importance of collaboration, especially in areas like digital mental health. Examples included the limitations of generalising health states from social media data, methodological issues, challenges in understanding other attributes (e.g. offline attributes) and the threats of relying on single data sources. These challenges emphasise the indispensability of interdisciplinary collaboration, which resonated deeply with my belief in merging human-robot interaction with other disciplines for a more holistic approach to tackling the interdependent challenges of robotic assistance in teleoperation. Although it seemed to me that some of the techniques were unique in their approach and application to specific conditions, I see an opportunity for careful examination into how some approaches could come together to enhance robotic autonomy and facilitate better human-robot interaction.

In conclusion, the school has added depth to my understanding and expanded my academic horizons, through both the sessions that felt directly applicable to my research and those that seemed only marginally relevant. The school was also an opportunity to meet other PhD students from diverse backgrounds and corners of the world. Our interactions provided valuable global perspectives on the various ML applications in health research. I must also add that I had the opportunity to explore the historic city of Oxford and its renowned University of Oxford colleges, both through guided tours and lone walks, which offered a cultural immersion and ignited a sense of academic inspiration. My interactions with students, researchers and industry professionals allowed me to forge meaningful connections in machine learning that broadened my understanding and opened doors to potential future collaborations and opportunities. Overall, it was a transformative learning experience that equipped me with a global network and renewed my sense of purpose in my research and professional journey.

References
[1] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Trans Pattern Anal Mach Intell, vol. 35, no. 8, pp. 1798–1828, 2013, doi: 10.1109/TPAMI.2013.50.

12 October 2023 / 18 October 2023

My Internship at Capital One

post by Edwina Borteley Abam (2019 cohort)

My internship at Capital One started mid-November 2021 and ended mid-May 2022. Capital One is a credit card company originally situated in the US with two branches located within the UK in Nottingham and London specifically. I interned at the Nottingham branch over a period of 6 months, on a part-time basis.

The company has several departments and units. I was placed within the Data Science team, which forms part of the wider Data group within the organisation. There are three main sections within the Data Science team: Acquisition, Customer Management and the Bureau team. The Acquisition team concentrates on building models to score new credit card applicants. The Customer Management team focuses on managing and monitoring the behaviour of all existing credit card customers and credit line extensions. The Bureau team manages all data and information exchanged between credit bureaus and Capital One. For my daily work, I was placed within the Customer Management team and collaborated on two related projects: Onescore2 and the Challenger Model.

The internship:

The Onescore2 project involved creating machine learning models to manage the behaviour of existing credit card customers. I worked together with my manager to build models to predict customers likely to default on their cards over a defined period of time. We used R (a statistical programming language) as the main tool for the project. The specific activities assigned to me on the project involved creating the R program files for executing the models, monitoring the progress of the models’ execution, collecting and interpreting model results, and updating the GitHub repository with project outputs. The previous knowledge and skills acquired from the Data Modelling and Analysis course in year two of my PhD helped me understand the technical details involved in the analysis and to carry out my assigned duties effectively on the project.

The second part of this work was the Challenger Model project, which involved building different models in Python to compare their performance with Onescore2. The project was an exploratory study of different conventional models in predicting the likelihood of default. The Challenger Model project serves as a baseline to compare with results from my PhD work, and could potentially form part of my PhD thesis. As this phase of the project is linked to my PhD work, I benefitted from the guidance and input of my supervisor. While working on the Challenger Models, I held periodic meetings with my manager, supervisors and other members of the Data Science team where I presented on progress and discussed possible directions for the project. I also took part in weekly stand-up sessions where all associates within the Data Science team shared updates on ongoing projects.
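As a rough illustration of what comparing challenger models against an existing model can look like, the sketch below scores two sets of predicted default probabilities with the area under the ROC curve (AUC). The post does not say which metric was used, so AUC is an assumption here, and the labels, scores and model names are invented toy values rather than anything from Capital One.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison.

    Returns the share of (defaulter, non-defaulter) pairs in which the
    defaulter received the higher score. Ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data: 1 = defaulted, 0 = did not default.
labels = [1, 0, 1, 0, 0, 1]

# Hypothetical predicted default probabilities from two models.
incumbent_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
challenger_scores = [0.7, 0.3, 0.25, 0.5, 0.2, 0.9]

results = {
    "incumbent": auc(labels, incumbent_scores),
    "challenger": auc(labels, challenger_scores),
}
best = max(results, key=results.get)

for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
print(f"higher AUC on this toy data: {best}")
```

Evaluating every model on the same held-out customers with the same metric is what makes such a comparison fair; in practice a library implementation of the metric would normally replace the hand-written loop.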

My reflections:

Looking back on my internship, overall the experience has been insightful, an exciting journey and a time of personal development. I have grown and evolved in several areas in terms of interpersonal and professional skills.

Upon my arrival in the first week of the internship, my manager deliberately arranged informal meetings and chat sessions with other members of the Data Science team. These introductions and chats exposed me to a range of people in various roles and at different levels of leadership in the team. They helped me to integrate quickly into the team, create new connections and meet new people. Despite being naturally reserved, I enjoyed the conversations very much, as everyone was friendly. I was encouraged to step out of my shell to interact with more people. During my interactions, I seized the opportunity to ask all the lingering questions I had on the topic of credit scoring, which is also at the heart of my PhD research. Each person was friendly and particularly eager to answer all my questions and chat about the work they do.

Apart from the Data Science team, I had the chance to speak with other associates in other departments of the company and that experience was reassuring and enhanced my confidence at the workplace. I got first-hand experience in mixing with different people from different backgrounds in an office setting and learning to blend with them. The conversations in the first couple of weeks opened up my understanding more on the details of credit scoring and credit cards. I got more understanding of how the different teams work together to make credit cards available to people and how customers are managed and credit lines extended. I had the opportunity to join major meetings and to hear updates on projects being worked on within the different departments of the organisation. This also gave me a wider view of other aspects of the business. I was able to connect how the theory of credit scoring I had read in books and research articles played out practically in the real world through this experience.

During my internship, I worked both from home and the office. Every week during the first few months, I worked three days at home and two days in the office. I found commuting to work on time a discipline to develop as this was my very first time working outside of industry. Although this was challenging initially, it got easier with time. The regular catch-ups and progress updates with managers and my supervisor were sometimes strenuous and nerve-wracking; however, they trained my communication and presentation skills.

The work culture in Capital One challenges associates to give their best on the job but at the same time encourages relaxation and places a high priority on wellbeing. I was surprised to find that, unlike in other work environments, several fitness and relaxation points like the gym, tennis and pool table were strategically placed in the Capital One building to support associates. In addition, during my internship, the company observed a day of fun activities for its associates every quarter of the year just to have a break from work. This shaped my perceptions about the working environment.

Capital One is the industry partner for my PhD and I was privileged to have access to their data for my PhD work. Through my connections with the team members, I was able to easily recruit participants for my first PhD study, which I believe would have been difficult without the internship. Overall, I enjoyed the internship and the experience has been beneficial not only for my PhD but also for my personal development.

11 May 2022 / 10 May 2022

The joy of building things. My reflection on the internship at BlueSkeye AI

post by Keerthy Kusumam (2017 cohort)

September 2020 – January 2021

I interned at BlueSkeye AI, a company that delivers ethical AI for supporting mental health for the vulnerable population using facial and voice behaviour analysis. The long-term vision of BlueSkeye AI is to ‘Create AI you can trust for a better future, together.’ The goals of my PhD align perfectly well with those of BlueSkeye, where comprehending various facial behaviours to recognise markers of mood disorders forms a core part of the work. BlueSkeye AI was co-founded by my PhD supervisor Prof Michel Valstar, and the team includes several of my past PhD colleagues. The following pointers are my reflections on my four-month-long internship at BlueSkeye AI.

The joy of building things that work. The internship at BlueSkeye rekindled my enthusiasm to build systems that work in the real world, face real challenges, and create real impact. When I joined, BlueSkeye AI had a product that was going to be released to the market, and what I had to build would then be integrated into this product. That made the problem extremely well-defined: we were not trying to define a problem itself but rather to engineer a solution that needs to work on real-world data, leveraging cutting-edge computer vision and machine learning research.

Real World Vs Research World. My emphasis on real-world data stems from my divided self, where I am both a computer vision researcher and a roboticist. Before doing my PhD I spent nearly 4 years in a robotics research lab with an active collaboration culture – where everyone in an open-plan workspace contributes to projects irrespective of their original funding sources. This cultivated the exchange of ideas across disciplines – computer vision, cybernetics, robotics, reasoning, machine learning etc. – leading to very creative and interesting bodies of work. In robotics, computer vision is often a tool that the robot relies upon to make decisions, which means robustness and consistency take precedence over accuracy. In computer vision research, however, beating the state-of-the-art on benchmark datasets seems to be the key marker of success. I enjoy both these aspects, and the internship opportunity at BlueSkeye AI gave me just that – a place to bring those together. I got to build a computer vision-based social gaze estimation system that works on a smartphone. The challenge was about finding the right balance between exploration and exploitation. Here I had to optimize for efficiency, usability, practicality, simplicity and data efficiency along with the standard performance metrics that I use in research.

The Team and Teamwork. My onboarding was seamless, owing to the hands-on approach adopted by BlueSkeye AI’s leadership. I was also familiar with the team, so I was lucky to enjoy an incredibly friendly and supportive environment. The weekly meetings, where everyone discussed progress or the issues they faced, served as learning sessions for me. From the team as a whole, I understood the value of communication and brainstorming in keeping up the momentum. I worked in sync with the lead machine learning engineer, who set up several documents and code specifically for me, which removed my roadblocks to integrating the module into a mobile device. I also learned how managing tasks in a time-critical manner helps save time and resources for the company as well as yourself.

Importance of values. One should never compromise on one’s values while working for a company, and it is important to work in a place where value systems align. BlueSkeye AI’s five-year mission is: ‘To create the most-used technology for ethical machine understanding of face and voice behaviour that enables citizens to be seen, heard, and understood.’ I was astonished by their sensitivity towards mental health research, their strict adherence to ethical guidelines while handling data, their transparency with data volunteers about their data, and the numerous clinicians with great expertise they have on board. Being part of the company, albeit during a short internship, provided me with a sense of purpose, and I felt attuned to my values.