Call for Participants: Fake Image Detection with Eye Tracking

post by Matthew Yates (2018 cohort)

I am a final-year Horizon CDT PhD student partnered with the Defence Science and Technology Laboratory (Dstl). My PhD project is about the detection of deep learning-generated aerial images, with the final goal of improving current detection models.

I am looking for participants to take part in a short face-to-face study on detecting fake aerial images, which we have created using Generative Adversarial Networks (GANs).

I am looking for participants from all backgrounds, as well as those who have specific experience in dealing with either Earth Observation Data (e.g. aerial imagery, satellite images) or GAN-generated images.

Purpose: To capture gaze behaviour as participants distinguish GAN-generated fake images from real aerial photos of rural and urban environments. Participant accuracy and eye movements will be recorded.

Who can participate? This is open to anyone who would like to take part, although the involvement of people with experience dealing with related image data (e.g. satellite images, GAN images) is of particular interest.

Commitment: The study should take 20–40 minutes to complete and takes place in Room B73, Computer Science Building, Jubilee Campus.

Reward: £10 Amazon voucher for your participation.

How to participate? Email me at matthew.yates1@nottingham.ac.uk with dates/times you are free so we can arrange a timeslot.

For any additional information or queries, please feel free to contact me.

Thanks for your time,

Matt

+44 (0) 747 386 1599 matthew.yates1@nottingham.ac.uk 

Map Check

post by Vincent Bryce (2019 cohort)

This summer saw me walking St Cuthbert’s Way, a 100km hiking trail in the Scottish Borders/Northumbria area, with my children. It was a great trip, plenty of challenge but achievable, and I’d recommend it: https://www.stcuthbertsway.info/long-distance-route/

The trail was well signed, but we needed our map and compass in places. It’s three years since I started the PhD, two of which were part-time, and it feels like a good time to check where I am and where I’m going:

Where am I

I’m starting the third year of a part-time PhD in the Horizon CDT, focussing on responsible research and innovation (RRI) and human resource information systems (HRIS). This is about exploring how organisations can innovate responsibly with digital technologies, the challenges this involves, and some of the specific issues for HR technologies.

I’ve chosen a thesis-by-concurrent-publications route: a set of related studies rather than one overarching thesis.

Where am I going

I am going to complete my PhD and plan to come back into full-time HR work, applying the insights to my digital HR work. The experience of being a student and researcher at the university where I work will help me keep a strong customer focus.

What have I done so far

Following a year of taught activity about a range of digital economy and computer science topics, I’ve completed a series of studies and articles.

Highlights include a study on published responsible innovation case studies exploring the benefits of RRI, pieces on HR analytics and their ethical implications, presentations at the Ethicomp, CIPD Applied Research, and Philosophy of Management conferences, and critical articles on wider challenges for responsible innovation such as low-code technologies and cross-cultural aspects.

I’ve seen new ideas and emerging technologies, and built skills in coding, data science and writing, exploring everything from bot-based blogging, digital watercoolers and AI coaching to augmented and virtual reality tools.

What are my main findings to date

  • Responsible innovation practices are associated with business benefits.
  • Digital technologies, in particular ones users can reconfigure for themselves, pose challenges for responsible innovation methodologies, because these tend to rely on the technology being developed in ways which anticipate and respond to societal needs. End users, rather than scientists and developers, are increasingly able to innovate for themselves.
  • Algorithmic HR technologies give HR new capabilities, but are linked to some ethical concerns and have features which imply a need for responsible innovation and implementation.
  • Interviews with HRIS suppliers suggest they have limited opportunities to engage wider stakeholders and anticipate downstream impacts, creating reliance on client organisations to reflect on how they apply the technologies.
  • The knowledge and values of HR practitioners are a critical constraint on responsible algorithmic HR adoption.

What are my priorities for the coming year

Completing my thesis synthesis document; concluding in-progress studies on the increasing scope of employee data collection and on HRIS supplier and practitioner perspectives; and getting into a position to submit by September 2023.

Right – onwards! I’ve recently attended the Productivity & the Futures of Work GRP conference on Artificial Intelligence and Digital Technologies in the Workplace to present about my study on the increasing scope of employee data collection and hear about what’s hot and what’s not.

originally posted on Vincent’s blog

What is Pint of Science?

post by Peter Boyes (2018 cohort)

“The Pint of Science festival aims to deliver interesting and relevant talks on the latest science research in an accessible format to the public – mainly across bars, pubs, cafes and other public spaces. We want to provide a platform which allows people to discuss research with the people who carry it out and no prior knowledge of the subject is required.”

This was a new one for me, a collision of worlds. I’ve spent the last 8 years in Nottingham, studying for my undergraduate, master’s, and now PhD. I did some extra bits alongside my course (PASS leader in its first year in the School of Mathematics, and some events here and there as a PhD researcher), but I stick mostly to my studies and explore volunteering in places beyond academia. I’ve enjoyed helping coordinate sports clubs and competitions since joining university, and Pint of Science arose as an opportunity to combine my two halves: volunteering and putting on events related to my studies.

I got involved at first as a general body to lend a hand on a couple of the nights, but moved into a Theme Lead role early on in the year when an opening popped up. About 9 months ago myself and my team of fellow volunteers were allocated our theme (Beautiful Mind – anything around human senses and the brain) and we set about recruiting speakers and planning our event. We had 3 evenings at Bunkers Hill, Hockley to fill, and grouped our 9 speakers into similar topic areas. These topics covered broadly Pain, Senses, and Mental Health. We checked out the venue space, and planned out schedules for the nights, with presentations, Q&As, and some activities for the audience such as quizzes (what else do you expect on a weeknight in a pub when you’re talking science). May flew round, and tickets got snapped up. The nights went fantastically, there was a buzz with the great speakers and the final night in particular packed out the venue space to end on a high note.

This side venture was a little outside my comfort zone: yes, I’m familiar with volunteering and running events, and I’ve been in academia for 8 years, but the theme wasn’t in my area of expertise and science outreach was a new experience for me. I was supported well throughout by a great team more familiar with the topics and with events like this one. I’ve learned a lot about outreach through these nights. For me this was about learning how to facilitate public outreach and convey cutting-edge research and expert topics to the general public, no easy task. The most revealing part of each night was being able to listen to the speakers talking to each other, some seasoned Pint of Science-ers, some new to the event. I also had the privilege of facilitating fellow Horizon CDT 2018 cohort member Shazmin Majid presenting her latest work.

This experience has given me confidence in presenting my work and in how to go about it, and equally how not to go about it. Avoid overloading slides with text and too much inaccessible specialist terminology. It’s fine to use some terms if you define them and get the audience up to speed, but you need to find other ways to convey your research if every slide needs five terms defining or sub-definitions; that breaks up any flow and makes the talk difficult to follow, particularly for non-experts in the field. Analogies are great, again not too many, and not too convoluted. I have been given advice before about analogies, as they can lead to misunderstanding of concepts if followed too far, but a well-crafted one can enhance the audience’s understanding. Demonstrations or activities that let the audience learn through involvement, rather than relying on a perfect explanation, also seal the deal on a great outreach talk. The simpler the demo, the more effective. Though doing any of those things is no easy task.

I would encourage other CDT students to get involved in the coming years from either side: later-stage PhD students and recently graduated alumni have a great opportunity to put their work out there, and early-stage candidates can see how researchers slightly further along the journey are engaging with this sort of outreach; it might even give you ideas about your own research.


Summer School Participation Reflection

post by Matthew Yates (2018 cohort)

I participated in the 2022 BMVA Computer Vision Summer School, which was held at the University of East Anglia. The summer school was aimed at PhD students and early-stage researchers working in computer vision, image processing, data science and machine learning. The event ran from the 11th to the 15th of July and consisted of a full week of lectures from researchers who are experts in the field, covering a wide array of computer vision topics with an emphasis on the latest trends pushing the field forward. In addition to the lectures, there was also a programming workshop and social activities in the evenings, such as a computer vision-themed pub quiz and a dinner held at Norwich Cathedral.

As the lectures covered a wide range of topics, not all of them were strictly relevant to my own PhD project and research interests, although it was useful to be exposed to these other areas and gain some tangential knowledge. The event started with a lecture by Ulrik Beierholm on cognitive vision: how it functions and how it compares and contrasts with similar computational vision systems such as Convolutional Neural Networks (CNNs). As my own background is in cognitive psychology and computational neuroscience, I found the lecture very engaging, even if it was mainly reiterating ideas I had already studied during my Master’s degree. The afternoon of the first day was given over to a programming workshop where we worked through tasks in a Google Colab notebook to familiarise ourselves with PyTorch and to program some of the key parts of a deep learning model pipeline. Although these were fun and useful tasks, we were not given enough time to complete them, as much of the first half of the workshop was taken up with technical issues in setting up the guest accounts on the lab’s computers.
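To give a flavour of those tasks, here is a minimal sketch (not the workshop’s actual Colab material; the model, sizes and dummy data are my own invented placeholders) of the key parts of a PyTorch pipeline: a small model, a loss function, an optimiser and a single training step.

    import torch
    import torch.nn as nn

    # A tiny classifier standing in for whatever model the exercise asks for.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
    loss_fn = nn.CrossEntropyLoss()
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)

    # One training step on a dummy batch of 32 "images" with 10 classes.
    images = torch.randn(32, 1, 28, 28)
    labels = torch.randint(0, 10, (32,))

    optimiser.zero_grad()
    logits = model(images)          # forward pass
    loss = loss_fn(logits, labels)  # compute the loss
    loss.backward()                 # backpropagate gradients
    optimiser.step()                # update the weights

In the real workshop this sort of loop would of course run over a proper dataset for many epochs; the point here is just the shape of the pipeline.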

The second day started and finished later than the first, with more lectures and an event in the evening, a structure followed throughout the rest of the summer school. The first lecture of the day was on Colour by Maria Vanrell Martoreli. Going into this lecture with no expectations I came out of it having found it very useful, with a much deeper understanding of the role of colour in the interpretation of objects in an image, both in human and machine vision systems. These were followed by lectures on image segmentation by Xianghua Xie and local feature descriptors by Krystian Mikolajczyk.

The image segmentation lecture presented some of the latest methods being used as well as some of the common problems and pitfalls encountered by researchers implementing these methods. While these two lectures presented a lot of well-articulated ideas in their respective areas, they fell outside my own research interests, so I don’t think I got as much value out of them as others in the room did.

The last lecture of the day was a rather densely packed overview of deep learning for computer vision by Oscar A Mendez. This was a very engaging lecture with a lot of information, including some good refreshers on more fundamental architectures such as MLPs and CNNs and a very intuitive introduction to Transformers, a rather complex class of deep learning models that is currently very popular in many research areas. In the evening we went into Norwich city centre for a bowling social event.

Wednesday morning consisted of lectures on Shape and Appearance Models by Tim Cootes and Uncertainty in Vision by Neill Campbell. Both of these were conducted online over Teams, as the presenters had caught COVID after attending the CVPR conference the previous week. The shape and appearance models lecture was informative but not of particular interest to me, whereas the uncertainty in vision lecture was quite interesting, and the presenter managed to include a good level of audience engagement activities despite presenting over a webcam.

After lunch we had a lecture on generative modelling by Chris Willcocks. This was a very interesting lecture, as it covered the current trends in generative modelling (e.g., GANs, Transformers) and also looked at architectures with the potential to be the future of the field, such as diffusion and implicit networks. As my own work looks at GANs, I found this talk particularly enlightening and also comforting, as it agreed with many of the arguments I include in my thesis, such as the current issues with using the Fréchet Inception Distance (FID) as an evaluation metric. In the evening we attended a dinner at Norwich Cathedral, which gave everyone a good chance to network and discuss the week’s events with other members of the summer school.
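For readers unfamiliar with FID: it compares the statistics of features extracted from real and generated images. Below is a minimal sketch of the calculation on random placeholder features, purely to show the arithmetic; standard FID (and my own pipeline) uses activations from an Inception network, typically 2048-dimensional, rather than random vectors.

    import numpy as np
    from scipy.linalg import sqrtm

    # Placeholder feature vectors standing in for Inception activations.
    real_feats = np.random.randn(500, 64)
    fake_feats = np.random.randn(500, 64)

    mu_r, mu_g = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(fake_feats, rowvar=False)

    # FID = ||mu_r - mu_g||^2 + Tr(sigma_r + sigma_g - 2 * (sigma_r sigma_g)^(1/2))
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # numerical noise can introduce tiny imaginary parts

    fid = np.sum((mu_r - mu_g) ** 2) + np.trace(sigma_r + sigma_g - 2 * covmean)
    print(f"FID: {fid:.2f}")

Part of the criticism discussed in the lecture is that this single number depends heavily on the feature extractor and the sample sizes used, which is one reason it can be a problematic evaluation metric on its own.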

Thursday consisted of another full day of lectures on various topics in Computer Vision. These were Unsupervised learning by Christian Rupprecht, Structured generative models for vision by Paul Henderson, 4D performance capture by Armin Mustafa, Egocentric vision by Michael Wray and becoming an entrepreneur by Graham Finlayson. At this point in the week, I was starting to become a little overwhelmed by the amount of information I had taken in on a range of highly technical topics. I think that it could have been more beneficial to have a slightly less dense schedule of lectures and mix in some workshops or seminars to fully take in all of the presentations. Despite this, I did find a lot of value in the lectures on this day, particularly the unsupervised learning lecture in the morning. The evening social event was a relaxed computer vision pub quiz, with a mix of themed questions about computer vision, AI, local and general knowledge. This was again a good time to get to know the other attendees and I thoroughly enjoyed it despite missing out on first place by a couple of points (I blame that on the winning team having a local).

Friday morning consisted of the last couple of lectures of the event. The first, Art and AI by Chrisantha Fernando, was particularly insightful and perhaps my favourite of the week. This lecture, by a DeepMind researcher, looked at state-of-the-art generation models such as DALL-E and asked whether an AI could actually create something more than a picture: something we would consider real “Art”. To examine this idea, the speaker dissected what we mean by the terms “art” and “emotion” in computational terms and discussed the possibility of AI art through this viewpoint. I found the mix of cognitive science, computer science and philosophy very engaging, as this cross-section of AI is where my own passion for the subject lies.

After the event finished at midday, I met some of the speakers, organisers and attendees for lunch to chat and reflect on the week. Overall, I found the summer school very enjoyable, if a little lecture-heavy, and would definitely attend again. I came back from the trip eager to try out some of the more intriguing models and architectures discussed, and I will also be going back over some of the key slides when they are released.


Legit Conference or Scam?

post by Peter Boyes (2018 cohort)

“I’m embarrassed and disappointed.” That was the opening line of an email to my supervision team and CDT administrators. This email was a reaction to attending a virtual conference in November to present a paper and hear about research in one of the fields that my PhD spans. The conference I had submitted to and was attending, thankfully virtually, appeared to be some sort of scam. I managed to be the first presenter on the first day, but it unravelled after that. A shamefully quick search online reveals “The World Academy of Science, Engineering and Technology or WASET is a predatory publisher of open access academic journals. The publisher has been listed as a “potential, possible, or probable” predatory publisher by Jeffrey Beall and is listed as such by the Max Planck Society and Stop Predatory Journals.” (https://en.wikipedia.org/wiki/World_Academy_of_Science,_Engineering_and_Technology, accessed 18th November 2021).

In this post I’ll talk through some of the process of writing up and submitting the paper, as many of my peers have done, but will add in some detail of the course my experience ended up taking.

The paper would have been my first: a write-up of the motivations, method, and findings of the first study in my PhD. This was an exploratory interview study with university project management group members around the decision-making process for two capital projects that were (at the time of starting) recently completed at the university. The motivation for the exploration was to inform the next stage of the PhD that would form the bulk of my thesis.

The paper was co-authored by my supervision team. Most of their input came in supervision meetings, rather than through a co-written, collaborative document as in some group projects; they contributed throughout with guidance on the study: at the design stage, sounding out planning; during the analysis of interview data, namely talking through emerging themes and subthemes and then a second round of analysis; and then reviewing a couple of drafts as the paper was being written. The writing feedback reflected the materials shared with them. The first draft was mostly a skeleton highlighting a structure that could be used to explore the study, from background and motivation through to methodology, results, analysis and, importantly for this study, the discussion and future work. The second review was larger, with a few of the themes and subthemes drafted. Key comments included refining the lengthy sections into manageable portions with a clearer narrative. The main input was adding some more summary subjective qualitative analysis to the themes, partly to save those skimming through from needing to read all the excerpts shared and re-tread my whole journey, and partly to introduce some more subjectivity, some opinion, to the data I was presenting, something I needed a push to do with my rather quantitative background in mathematics. Finally, there was a cut to reduce the length, removing repetition and, essentially, waffle that had crept in when drafting the sections and finding the narrative. Their guidance on my first paper was hugely valuable and informed some of the earlier design stages of my second study.

As mentioned earlier in this post, the paper was a write-up of my first study, an exploratory one that laid the foundation for some more directed reading into the literature, and ultimately for the main study I am designing and carrying out now around group decision making. This main study addresses one of the future directions of research suggested by the paper: incorporating metadata and context into the data presented to decision makers.

I was finishing up my first study and getting stuck into the write-up process, looking for a suitable place to publish or a conference to engage with, both to present at and to hear from people doing research in the same domain as me. There was some self-imposed pressure, after not picking one out earlier in the study process, to find a conference to submit to. I jumped at what looked like a great opportunity: a conference seemingly tailor-made for my paper and study, the “Decision Theory and Decision Support Systems Conference”.

WASET have a paper submission and feedback platform on their website. You create a login, submit your details and paper or abstract drafts for any of their conferences, and then communication with the organisers is done through messages on the platform rather than over email. All seemed easy to me, just some administrative boxes to tick. These messages and the platform cover most communication: the paper has been sent off for reviewer comments; submission status to check back in on; updates or re-uploads of submissions with revisions. My submission was initially under their abstract-to-full-paper option, which worked well to give me a deadline to get my paper to around 80% done and the abstract off to them. This came back quickly: the review of the abstract had been completed and I could upload my full paper when ready, with the second deadline now in place. That was submitted a couple of weeks later. I had feedback from a moderator on the platform that there were no formatting issues and that they might be in contact after further reading. Comments from reviewers eventually came back and were about removing the questions in the discussion section and the pieces lifted from the main body introduction into the abstract. This seemed like minimal feedback, a little odd; from hearing what some of my peers had been through, I was surprised at such small changes. They seemed like stylistic requests of the conference, but some rephrasing and tidying up and I was done. The paper was accepted. I chalked it up as a strange one but was happy to have my first paper in somewhere, with what felt like the home straight, the conference and presentation itself, left while I cracked on with more design for my next study.

The dates advertised on the conference site for final deadlines kept rolling, but I sat fine with it as I’d done final edits to my paper and the “camera ready” version was in. I presumed it was undersubscribed and they were trying to get some more interest and papers submitted in the closing months and weeks. In the week ahead of the conference I set aside some time and reread my paper, pulled together a presentation for the conference with a few presenter notes, and did a small run-through so I would be ready on the day.

It rolled around and I was excited. It wasn’t my first virtual conference of the pandemic (I had attended a couple already, such as GISRUK), and it wasn’t my first time presenting online, as I had done so for a few internal presentations with the Mixed Reality Lab and at the Horizon CDT retreat. Both of these fell at an earlier stage of this study, so in a way I’d practised talking about this topic and fielded some questions already on the study design, the potential research impacts, and how it all fitted into the larger picture of my PhD. I received the meeting link on the morning; exact proceedings hadn’t been released, which was a bit odd, but presenters had been grouped and I knew I was in the first wave before break 1 on day 1. It was supposed to be a 2-day conference with 3 groupings of talks across each one, and links to e-posters sent round for looking at outside of this time. It became apparent quickly that the conference, in a rather refined area of “Decision Theory and Decision Support Systems”, was being run as part of a series of concurrent conferences by WASET, some entirely unrelated to decision making or support systems. This again seemed odd but not a pressing issue, as I hadn’t got that much experience with conferences, particularly smaller ones; I thought maybe this is how they can be run. The issue became apparent when the Zoom call started and I could only see one other name on the participants list from those I expected to see in my block. The other names were from the concurrent conferences: they were in the same call and room as me. I checked my link and it appeared to be the correct one. I was unsettled but trying to focus on being ready to present. The session chair opened up and read a running order; I was up first.

After I finished presenting, and the floor was open for questions, it collapsed. People were asking not about my study but why they were hearing about capital project management groups and decision making, and not what they were there to present on. The chair was pushing to move on to the next presenter. A quick search online and I found the Wikipedia article on WASET mentioned earlier in this post and a few other blogs about peoples’ experiences with the conferences and attendees being frauds. I exited the call quickly.

Hindsight is a wonderful thing, looking back at the process there were little indicators that something might not be quite what I’d hoped. Maybe I was distracted by a desire to get that first paper over the line and accepted, a badge on my sleeve and a boost of confidence for the next stage of my PhD. Maybe it should be chalked up to inexperience.

I wouldn’t wish for anyone else to go through this, particularly other early career researchers and PhD students. This still afforded me the opportunity to get on with writing up a study that could have sat in draft notes for months while I carried on with other research, the chance to go through the steps of writing up and submitting with my supervision team, albeit for a dud, receiving feedback and editing, and forming a presentation and presenting to an audience. It is a shame it had to happen this way and I am looking forward more now to writing up and submitting my next study. I hope my experience prevents someone else from falling foul of this sort of scam.

You can read our paper here: The Role of People and Data in Complex Spatial-Related Long-Term Decisions: A Case Study of Capital Project Management Groups.

Reflecting on my Journal Paper Submission

post by Matthew Yates (2018 cohort)

On 16th June 2022, my paper “Evaluation of synthetic aerial imagery using unconditional generative adversarial network” was accepted into the following August edition of the ISPRS Journal of Photogrammetry and Remote Sensing. This was my first paper to be published and also my first time going through the academic peer-review process.

Although the paper has only just been published in summer 2022, the research began in 2019, with much of the initial work taking up a large portion of the first year of my PhD. The motivation for the paper was to take all the early research I had been doing for my PhD thesis and produce a deliverable publication from it. I was keen to do this because it would give me a chance to get ahead of the thesis-writing stage at the end of the PhD, as this paper would more or less cover the first empirical chapter, and it would also introduce me to the peer-review process, which I did not really have any prior experience with.

After delays due to the outbreak of COVID and the subsequent lockdown measures, the paper was submitted to the Journal of Machine Learning in summer 2019. The scope of the paper had been stripped back from the original ideas, with fewer models being benchmarked due to the inaccessibility of the Computer Vision GPU cluster at that time. After a couple of months of waiting, the journal came back with a “Major Revisions” decision along with plenty of comments from the 3 reviewers. The reviewers deemed the paper to be lacking a substantial enough contribution to warrant publication at this stage, and there was a general negative sentiment among the reviews. I then resubmitted a month later, after responding to the reviewers’ comments and making large amendments to the paper, only to get a rejection after the next round of reviews. As this was the first paper I had tried to get published, I was rather disheartened to receive this decision after spending so much time on the work. My supervisors were less concerned; having gone through the process many times, they told me this was fairly normal and that it would be a good idea to submit to a different venue which might be more appreciative of the work.

In early spring 2021 I sent the paper to the ISPRS Journal of Photogrammetry and Remote Sensing. The decision I received here was another rejection, although this one came with much more constructive criticism, and I was advised to revise and resubmit at a later date. As this was over a year on from my original submission attempt, I had also been working on additional research for my PhD, and I made the decision to incorporate some of these new results into the original paper, significantly increasing the contributions of the work. At this point my primary supervisor, Mercedes Torres Torres, brought in Michael Pound to help add a new perspective on the work and give me some new feedback before resubmission. After submitting to this journal again in October 2021, I was given a “major revisions” decision in December 2021; the reviewers, a mix of old and new from the last failed attempt at the same journal, had responded much more positively to the changes and additional content but thought the paper still required additional work and depth in some parts. In January 2022 I resubmitted, hoping that this would be the last time, but received another round of corrections in April. At this point I was getting fairly fatigued with the entire process, having done the bulk of the work years ago and with each round of revisions taking months. Luckily, the reviews in this last round were positive and only one of the 3 reviewers called for additional work to be done. As I could see I was close to publication, I went over all of this final reviewer’s comments in detail and responded accordingly, as I did not think I could face another round of revisions and wanted to move on to other research. Luckily the next decision was an acceptance, with all the reviewers now satisfied with the work.

The acceptance of the paper was a huge relief, as it felt like the time and effort my collaborators and I had put into the paper was finally vindicated. I was additionally pleased as this is my first publication, something I had been looking to achieve for a few years. The paper also represents the first part of my PhD project and gives that whole stage of research more credibility now it has gone through the peer-review process. Following this publication, I have been invited to join the journal’s list of reviewers should anything in my field be submitted. This is something I would be interested in doing to get an insight into the other side of the review process, which could feel quite opaque at times. I have also been invited to publish further research in other related journals. These initial responses to the publication have shown that it was worth enduring the rather lengthy and sometimes unpredictable process of peer review.

Outreach in the time of Covid

post by Luke Skarth-Hayley (2018 cohort)  

It’s tough to do outreach during a pandemic. What can you do to help communities or communicate your research to the public when it’s all a bit iffy even being in the same room as another person with whom you don’t live?

So, what can we do?

Well, online is obvious. I could certainly have done an online festival or taught folks to code via Zoom or Skype or Teams or something.

I don’t know about you, but I’ve got a serious case of online meeting fatigue. We’re not meant to sit for hours talking in tiny video windows. I’m strongly of the opinion that digital systems are better for asynchronous communications, not instantaneous.

So, what did I do?

I took my research online, asynchronously, through various means such as releasing my code as an open-source plugin for the Unity game engine and creating tutorial videos for said plugin.

Open-Source Software

What is open source? I’m not going to go into the long and storied history of how folks make software available online for free, but the core of it is that researchers and software developers sometimes want to make the code that forms their software available for free, for others to build on, use, and form communities around. A big example of open-source software is the Linux operating system. Linux-based OSes form the foundation on which most of the websites and internet-connected services you use daily are built. So, sometimes open-source software can have huge impacts.

Anyway, it seems like a great opportunity to give back from my research outputs, right? Just throw my code up somewhere online and tell people and we’re done.

Well, yes and no. Good open-source projects need a lot of work to get them ready. I’m not going to say I have mastered this through my outreach work, but I have learned a lot.

First, where are you going to put your code? I used GitHub in the end, given it is one of the largest sites to find repositories of open-source software. But you might also want to consider GitLab, Source Hut, or many others.

Second, each site tends to have guidance and best practices on how to prepare your repository for the public. In the case of GitHub, when you’ve got research code you want to share, in addition to the code itself you want to:

      • Write a good readme document.
      • Decide on an open-source license to use and include it in the repository.
      • Create a contribution guide if you want people to contribute to your code.
      • Add a CITATION.cff file to help people cite the code correctly in academic writing.
      • Get a DOI (https://www.doi.org/) for your code via Zenodo (great guide here: https://guides.lib.berkeley.edu/citeyourcode) so it can be cited correctly.
      • Create more detailed documentation, such as via GitHub’s built-in repository wiki, or via another method such as GitHub pages or another hosted documentation website.
      • Bonus: Consider getting a custom domain name to point users to that redirects to your code or that hosts more information.

That’s already a lot to consider. I also discovered a couple of other issues along the way:

Academic Code

Academic Code, according to those in industry, is bad. Proving an idea is not the same as implementing it well and robustly, with all the tests and enterprise code separation, etc. That said, I have seen some enterprise code in my time that seems designed to intentionally make the software engineer’s job harder. But there is a grain of truth regarding academic code. My code was (and still is in places) a bit of a hacked together mess. Pushing myself to prepare it for open source immediately focused the mind on places for improvement. Nothing is ever done, but I did find ways to make the code more adaptable and flexible to diverse needs rather than just my own, fixed some outstanding bugs, and even implemented custom editors in Unity so non-programmers would be able to do more with the plugin without having to edit the scripts that underpin it. In addition to making the code better for others, it made it better for me. Funny how we do what’s best for ourselves sometimes under the guise of what’s best for others.

Documenting a Moving Target

No software is ever done. Through trying to break the Curse of Academic Code as above, I rewrote a lot of my plugin. I had already started documenting it though. Cue rewriting things over and over. My advice is split your docs into atomic elements as much as possible, e.g., if using a wiki use one page per component or whatever smallest yet useful element you can divide your system up into for the purposes of communicating its use. Accept you might have to scrap lots and start again. Track changes in your documentation through version control or some other mechanism.

Publicising Your Open-Source Release!

Oh no, the worst bit. You must put your code child out in the world and share it with others. Plenty of potential points to cringe on. I am not one for blatant self-promotion and rankle at the idea of personal brands, etc. Still, needs must, and needs must mean we dive into “Social Media”. I’m fortunate I can lean on the infrastructure provided by the university and the CDT. I can ask for the promotion of my work through official accounts, ask for retweets, etc. But in general, I guess my advice is put it out there, don’t worry too much, be nice to others. If people give you grief or start tearing apart your code, find ways to disambiguate real feedback from plain nastiness. You are also going to get people submitting bug reports and asking for help using your code, so be prepared for how you balance doing some of that with concentrating on your actual research work.

Tutorial Videos

Though I can quite quickly point back to the last section above, and say, “MOVING TARGET!” there is value in video tutorials. Even if you must re-record them as the system changes. For a start, what you see is what you get, or more accurately what you get to do. Showing people at least one workflow with your code and tools in a way that they can recreate is immediate and useful. It can be quick to produce, with a bit of practice, and beats long textual documentation with the occasional picture (though that isn’t without its place). Showing your thinking and uses of the thing can help you see the problems and opportunities, too. Gaps in your system, gaps in your understanding, gaps in how you explain your system are all useful to flag for yourself.

Another nice get is, depending on the platform, you might get an accidental boost to awareness of your work and your code that you’ve released through The Algorithm that drives the platform. Problems with Algorithms aside, ambient attention on your work through recommended videos e.g., on YouTube can be another entry point for people to discover your work and, their attention and interest willing, prompt them to find out more. This can be converted into making contacts from people who might want to use your tools, who might then participate in studies, or it may make people pay attention to your work, which you can use to nudge them into checking out something you are making with your tools, again making them into a viewer/player/potential study participant.

But how do you go about recording these things? Well, let’s unwrap my toolkit. First up, you’re going to need a decent microphone, maybe a webcam too. Depending on how fancy you want to get, you could use a DSLR, but those can be prohibitively expensive. Maybe your research group has one you can borrow? Same goes for the microphone. I can recommend the Blue Yeti USB microphone. It’s a condenser microphone, which means the sound quality is good, but it can pick up room noise quite a bit. I just use a semi-decent webcam with at least 720p resolution, but I’ve had that for years. This is in case you want to put your face on your videos at any point. Just how YouTube do you want to be?

Anyway, you have some audio and maybe video input. You have a computer. Download OBS Studio is my next recommendation. You can get it at https://obsproject.com/. This is a piece of software that you can use to record or stream video from your computer. It even has a “Virtual Camera” function so you can pipe it into video conferencing software like Zoom or Microsoft Teams. Cue me using it to create funny effects on my webcam and weird echoes on my voice. But, in all seriousness, this is a very flexible piece of freely available software that allows you to set up scenes with multiple video and audio inputs that you can then switch between. Think of it as a sort of home broadcasting/recording kit. It is very popular for people making content for platforms like YouTube and Twitch.tv. I’ll leave the details to you to sort out, but you can quite quickly set up a scene where you can switch between your webcam and a capture of your computer’s desktop or a specific application, and then start recording yourself talking through whatever it is you want to explain. For example, in my tutorial videos I set it up so I could walk through using the plugin I created and open-sourced, showing how each part works and how to use the overall system. Equally, if you aren’t confident talking and doing at the same time you could record your video of the actions to perform, and then later record a separate audio track talking through what you are doing in the video. For the audio, you might want to watch the video as you talk and/or read from a script, using something like Audacity, a free tool for audio recording you can download from https://www.audacityteam.org/.

Which brings me on to my next piece of advice. Editing! This is a bit of a stretch goal, as it is more complex than just straight up recording a video of you talking and doing/showing what you want to communicate. You could just re-record until you get a good take. My advice in that case would be to keep each video short to save you a lot of bother. Editing takes a bit more effort but is useful and can be another skill you can pick up and learn the basics of with reasonable effectiveness. Surprisingly, there is some excellent editing software out there that is freely available. My personal recommendation is DaVinci Resolve (https://www.blackmagicdesign.com/products/davinciresolve/), which has even been used to edit major film productions such as Spectre, Prometheus, Jason Bourne, and Star Wars: The Last Jedi. It is a serious bit of kit, but totally free. I also found it relatively simple to use after an initial bit of experimentation, and it allowed me to cut out pauses and errors, add in reshot parts, overdub audio, and so on. This enables things like separating the recording of your on-screen actions from your voiceover explanation. Very useful.

Next Steps

My public engagement is not finished yet. My research actively benefits from it. Next up, I intend to recruit local and remote game developers for game jams that use the plugin, specifically to evaluate the opportunities and issues that arise with its use, as well as to build an annotated portfolio as part of my Research through Design-influenced methodology.

Conclusion

So, there you have it. Ways I’ve tried to get my research out to the public, and what I plan to do next. I hope the various approaches I’ve covered here can inspire other PhD students and early-career researchers to try them out. I think some of the best value in most of these comes from asynchronicity: with a bit of planning we can communicate and share various aspects of our research in ways that allow a healthy work-life balance, and that can be shaped around our schedules and circumstances. As a parent to a young child, I know I’ve appreciated being able to stay close to home and work flexible hours that allow me to tackle the unexpected while still doing the work. If there is one thing I want to impress upon you, it is this: make your work work for you, on your terms, even if it involves communicating your work to tens, hundreds, thousands of people or more. Outreach can take any number of forms, as long as you are doing something that gives back to or benefits some part of society beyond academia.

You can get the Unity plugin here: https://github.com/lukeskt/Reactive-Mise-en-scene

My Internship at Capital One

post by Ana Rita Pena (2019 cohort)

Interning at Capital One

Between May and October 2021 I held a part-time internship with my industry partner, Capital One. Capital One is a credit card company that launched its operations in the UK in 1996 and has its parent company in the US. The company is known for being technology-driven and, in the UK specifically, for focusing on credit-building cards as its main product.

My internship with Capital One UK consisted of working on several projects as part of their Responsible AI initiative, owing to my interest in topics related to “ethical” machine learning and FAccT/FATE (Fairness, Accountability, Transparency and Explainability).

The Responsible AI initiative initially consisted of three projects: the Trustworthy Algorithm Checklist, Global Methods and Individual Level Methods. The Trustworthy Algorithm Checklist project was already under way when I joined the company in May 2021. The project consisted of creating a checklist for model developers to complete during the model development process, in order to instigate some reflection on, and mitigation of, the ethical risks and consequences associated with a new model. The Global Methods project was subdivided into two parts. Broadly, the project aimed to evaluate different explainability methods and provide guidance and recommendations on methods to be adopted internally by the Data Science team. The first part consisted of interviewing stakeholders from different departments to better understand which information each of them needed about the models, and the second part consisted of a technical evaluation of the tools. Finally, the third project, Individual Level Methods, aimed to explore how consumers understand different explainability methods as representations of their individual outcome. This third project never went ahead due to lack of time.

Day to day, I worked within the Data Science Acquisition team, as my manager was based in this team and spent a percentage of his time working on the Responsible AI initiative; however, my workflow was separate from the rest of the team’s. Being able to attend the team’s meetings allowed me to gain a better understanding of the workings of the company and the processes involved in model development and monitoring.

In the following sections I will describe the two projects I worked on in more detail, as well as some general reflections and thoughts on the internship.

Trustworthy Algorithms Checklist

Over the last few years there have been several stories in the press of algorithms that, once implemented, end up having unintended consequences which negatively affect users; for example, the UK’s A-level grading algorithm controversy, which ended up affecting students from lower socio-economic backgrounds more negatively than others by deflating their grades the most. This has led research in the area of “ethical” AI to cross into real-world applications. The Trustworthy Algorithm Checklist project aims to design a checklist that will make model developers actively reflect on unwanted impacts of the model they are building as part of the development process. After completion, the checklist would then go through an ethics panel composed of stakeholders from different business departments within the company for approval. The initial draft checklist was divided into three sections: Technical Robustness; Diversity, Non-discrimination and Fairness; and, finally, Accountability.

The next iteration of the checklist design created more sections, based on the Ethical Principles for Advanced Analytics and Artificial Intelligence in Financial Services created by UK Finance, a trade association for the UK banking and financial sector. This resulted in the following five sections: Explainability and Transparency; Integrity; Fairness and Alignment to Human Rights; Contestability and Human Empowerment; and Responsibility and Accountability. It was at this stage that I joined the company and started working on this project. The project involved stakeholders from the legal and compliance departments as well as from the data science department. This second iteration of the checklist was trialled with a model that was in the initial stages of development; from this trial it was noticed that most of the answers to the prompts in the checklist were already covered by existing Capital One model policy. In order to avoid the checklist becoming just another form that needs to be submitted, the team decided to put more emphasis on the ethics panel discussion meeting and to use the checklist as an initial prompt in that discussion, fostering critical reflection aided by stakeholders who come from different backgrounds and hence bring different perspectives.

While this project initially focused only on the algorithmic aspect of decision making, the team discussed the possibility of expanding the checklist to the process of developing credit policies. It is the combination of the algorithmic risk assessment with the credit policy that ends up impacting the consumer, hence the need to critically evaluate both of these parts.

Explainability Toolbox

Global methods are a set of tools and visualisations that help us better understand the way complex machine learning models work at an overall, rather than individual, level. These methods can focus on the variables (for example, which variables have the biggest impact on the result, or how different variables relate to one another) or on the general decision rules of the model (for example, summarising a complex model by approximating it with a set of simple decision rules).

Including explainability methods in the machine learning workflow is quite important, as they allow us to verify the behaviour of a model at any time and check that it is working well and as expected.
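As a concrete illustration of what a global method can look like, here is a minimal sketch of permutation feature importance on synthetic data; this is illustrative only and is not Capital One’s internal tooling or data. The idea is to rank variables by how much shuffling each one hurts the model’s held-out performance.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for a tabular dataset.
    X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = GradientBoostingClassifier().fit(X_train, y_train)

    # Shuffle each feature in turn and measure the drop in held-out accuracy.
    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    for i in result.importances_mean.argsort()[::-1]:
        print(f"feature {i}: mean importance {result.importances_mean[i]:.3f}")

Other global methods mentioned above, such as surrogate decision rules or partial dependence plots, would answer slightly different questions but serve the same overall purpose.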

Figure 1: What Do We Want From Explainable AI?

When this project was first discussed with me, it consisted of implementing and comparing different explainability methods and package implementations (both internally developed and open-source tools) in order to propose a set of tools to standardise what different teams within the Data Science department use. Due to my interdisciplinary experience in Horizon, and partly from working within the human factors field, I was aware that the technical components of explainability methods are not the only factor affecting how well technology is adopted and used in an institutional setting. To address the social aspect of this task, I suggested a series of interviews with different stakeholders who interact with the Data Science team, to better understand what information they needed about the models, what they would like to understand better, and their views on the methods that were currently implemented. Doing these interviews allowed me to understand better what different roles and departments did and how they interacted, as previously I had mainly interacted only with the Data Science department.

From the interviews I learned that stakeholders in more business-related roles, as well as higher-level roles, were interested in being able to translate how changes in the model’s behaviour impact the business, e.g. in terms of the number of people given loans, profit, or the number of defaults. Stakeholders from the technical departments were also aware of the shortcomings of the methods they currently used but had not had the time to test alternatives in their workdays. From the interviews I created a list of guidance for presenting technical materials to stakeholders, and identified several criteria to evaluate in the second part of the project.

In the second part of the project, I compiled different packages/libraries (open source, and internal packages developed by different C.O. departments and countries) to test their methods and give guidance on what could be beneficial to implement across the Data Science department. During this process I learned that different branches of C.O. in different countries use different coding languages, and that models from different Data Science teams had different characteristics owing to their different needs and to what their department and branch had historically implemented. This meant that teams often had to create their packages from scratch to meet the specificities of their model, even when they were using the same tools; this could have been avoided with a more uniform language or model construction approach.

Final Reflections

This was my first time working in industry, and I was very pleasantly surprised by the importance that Capital One places on research, running internal conferences and having specialised research centres (in the US, which is their biggest market). This was further encouraged by the very open and collaborative work environment; for example, the Responsible AI UK initiative I was involved with had regular meetings with other research teams within C.O. working in the same field.

While the company had very good intentions and initiated projects like the ones I worked on, the reality of the scale of the UK branch meant that everyone on the team (apart from myself) worked on the Responsible AI initiative for only 10% of their time, on top of their team roles. The Explainability Toolbox project also showcased the drive to optimise processes across departments, even if this is hard to accomplish at scale due to logistical constraints.

Overall, my internship at Capital One gave me a better understanding of the Consumer Credit Industry and the way different departments come together to be able to provide a financial product to the consumer.

A lesson in remote work and open-source success: exploring emotion classification with farmers at CIMMYT

post by Eliot Jones-Garcia (2019 cohort)

My journey with CIMMYT, the International Maize and Wheat Improvement Center, began shortly before enrolling with the Horizon CDT. In February of 2019, after having recently graduated with an MSc in Rural Development and Innovation from Wageningen University, I found myself in Mexico and in need of a job. I was familiar with the organisation because of its pivotal role in delivering the Green Revolution of the 1960s, an era of significant technological advancement in agriculture enabled by seed-breeding innovation achieved by CIMMYT scientists. Since then, they have branched out to engage in several areas of agricultural research, from climate change adaptation and mitigation strategies to markets and value chains.

Thus, prior to beginning my PhD research in earnest, I spent 6 months conducting a systematic review of technological change studies for maize systems of the Global South. I worked at the headquarters in Texcoco and gained valuable experience across the various academic disciplines CIMMYT employees use to approach agricultural sustainability. Having forged strong relationships with management staff, I had their support for my move to Nottingham and my transition in research toward ‘digital’ agriculture.

During my first year of research at Horizon, I worked with the CIMMYT staff to conceptualise an internship project. The plan was to head back to Mexico once again in the summer of 2020 to collaborate with scientists there. Unfortunately, however, the unexpected onset of COVID-19 forced me to change plans. At first the work was postponed in the hope the situation would ease, but to no avail. I decided to undertake my internship remotely and part-time, beginning in January of 2021. In hindsight, I was incredibly pleased to have had the initial in-person experience, but working at a distance would prove to have its own great lessons.

The goal of my work was to explore different methods of natural language processing, sentiment analysis and emotion classification for analysing interviews with farmers. COVID-19 had not only stunted my travel plans; all CIMMYT researchers were finding it hard to get to farmers to collect data. These interactions were increasingly taking place remotely, via mobile phones. This removed a significant interpersonal dimension from the research process; without supporting visual context, it became difficult to understand affective elements of conversation. I was given access to a series of interviews with different agricultural stakeholders that had been manually coded according to their use of technology, and charged with finding out how these digital tools might aid in analysing audio and textual data.

I approached the task by exploring the grounding literature. The first major insight from my internship was how to turn around a thorough and well-argued review to motivate a study in a short time, whilst providing a good understanding for myself and the reader. This yielded a variety of approaches to defining and measuring emotion, selecting audio features for analysis, and modelling tools. I ended up taking a conventional approach, using ready-made R and Python tools and lexicons to analyse text, and a series of widely available labelled datasets to train the model. The second insight from my internship was to engage with different open-source communities and apply available tools to achieve my desired goal.
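To give a flavour of the lexicon-based end of this kind of analysis, the sketch below scores a couple of invented interview snippets with NLTK’s VADER sentiment analyser. This is illustrative only: the snippets are not CIMMYT data, and the specific tools and lexicons I actually used differed.

    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")  # one-off download of the sentiment lexicon

    analyser = SentimentIntensityAnalyzer()

    # Invented snippets standing in for transcribed interview text.
    snippets = [
        "The new seed variety saved us a lot of labour this season.",
        "I stopped using the app because it kept giving the wrong advice.",
    ]
    for text in snippets:
        scores = analyser.polarity_scores(text)  # neg/neu/pos/compound scores
        print(f"{scores['compound']:+.2f}  {text}")

Emotion classification goes a step further than this positive/negative scoring, assigning labels such as anger, fear or joy, which is what allowed the comparisons between adopters and dis-adopters described below.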

In combination with working remotely, these activities gave me great confidence to deal with tasks independently, to seek out new skills, and to apply them to a high standard with the support of experts. More than anything, this internship taught me to use my academic abilities to unpack and explore problems in a concise and specific way, and to deliver the outputs CIMMYT wants: actionable insights that can be applied in a Global South context and that motivate future research.

In light of this, I produced a structured analysis and report for my CIMMYT supervisors, which was then published as an internal discussion paper in December of 2021. Findings from the study indicate that sentiment analysis and emotion classification can indeed support remote interviews and even conventional ethnographic studies. The analyses revealed several biases related to the transcription and translation of text, and suggested that future studies handle these steps more consistently to mitigate the unreliability they may introduce. In terms of affect, there was a clear relationship between the different sources of data: dis-adopters of technology, those who rejected its use, were shown to be angrier relative to the rest of the sample, whereas new adopters expressed greater joy and happiness. While this confirmed our expectations, there were also unexpected insights, for example, that female farmers were less fearful in the adoption of technologies. It is expected that in future this research may contribute to better targeted interventions, making technologies available to those who are more likely to make use of them.
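As a rough illustration of how such group-level contrasts can be produced, the snippet below aggregates per-interview emotion scores by adopter category. The categories mirror those mentioned above, but every column name and number is fabricated for the example and is not the study's data.

```python
# Sketch: compare mean emotion scores across adopter categories with pandas.
# The data frame is fabricated purely to illustrate the aggregation step.
import pandas as pd

interviews = pd.DataFrame({
    "adopter_category": ["dis-adopter", "dis-adopter", "new adopter", "new adopter"],
    "anger": [0.042, 0.037, 0.011, 0.009],
    "joy":   [0.008, 0.012, 0.031, 0.028],
})

# Mean emotion score per group, the kind of contrast described in the findings
print(interviews.groupby("adopter_category")[["anger", "joy"]].mean())
```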

Moving forward, I continue to work with my industry partner on smaller projects and look forward to collaborating with them in a professional capacity. This experience has been a great help in my PhD, focusing the direction of my research and highlighting the role of data in shaping how knowledge is created and how that plays into agricultural development. It has also helped me to manage tasks, allocate time wisely, and produce industry-standard work that provides benefit to farmers. The final version of the work is undergoing peer review and I hope to see it published in the near future.

If anyone would like to learn more about this work or would like to contact anyone at CIMMYT, please do not hesitate to contact me at eliot.jones@nottingham.ac.uk.

Many thanks for reading!

Eliot

Trusting Machines? Cross-sector lessons from healthcare and security

Royal United Services Institute (RUSI): Trusting Machines? Cross-sector lessons from Healthcare and Security, 30 June – 2 July 2021

post by Kathryn Baguley (2020 cohort)

Overview of event

The event ran 14 sessions over three days, which meant that the time passed quickly. The variety was incredible, with presenters from varied multidisciplinary backgrounds and many different ways of presenting. It was great to see so many professionals with contrasting opinions getting together and challenging each other and the audience.

Why this event?

My interest in this conference stemmed from the involvement of my industry partner, the Trustworthy Autonomous Systems hub (TAS), and from wanting to hear the contributions by speakers from the University of Nottingham and the Horizon CDT. The conference's focus on the security and healthcare sectors was outside my usual work, so I thought the sessions would give me new things to consider. I was particularly interested in gaining insights to help me decide on some case studies and in getting ideas on incorporating the ‘trust’ element into my research.

Learnings from sessions

Owing to the number of sessions, I have grouped my learnings by category:

The dramatic and thought-provoking use of the arts

I had never considered the possibilities and effects of using the arts as a lens for AI, even as a long-standing amateur musician. This is a point I will carry forward, perhaps not so much for my PhD as for training and embedding in my consultancy work.

The work of the TAS hub

It was great to learn more about my industry partner, particularly its interest in health and security.  I can now build this into my thoughts on choosing two further case studies for my research.  Reflecting on the conference, I am making enquiries with the NHS AI Lab Virtual Hub to see whether there are relevant case studies for my research.

Looking at the possible interactions of the human and the machine

Overall, and in a good way, I came away from the event with more questions to ponder, such as: ‘If the human and the machine were able to confuse each other about their identity, how should we manage and consider the possible consequences?’ My takeaway was that trust is a two-way street between the human and the machine.

Aspects of trust

I’d never considered how humans already trust animals and how this works, so the Guide Dogs talk gave me something entirely different to think about: the power the dogs have, and how the person has to trust the dog for the relationship to work. Dr Freedman’s session, in which he equated trust to a bank account, also brought the concept alive. Ensuring that the account does not go into the ‘red’ is vital, since red signifies a violation of trust and recovery is difficult; positive experiences reinforce trust, so it needs to be kept topped up.

The area of trust also left me with a lot of questions that I need to think about in terms of how they will feature in my research, such as ‘Can you trust a person?’, ‘Do we trust people more than machines?’ and ‘Do we exaggerate our abilities and those of our fellow humans?’ The example of not being able to tell the difference between a real picture and a deepfake, while believing we can, is undoubtedly something to ponder. As that example shows, the assumption that a human is more trustworthy is in some cases a fallacy. Also, Prof Denis Noble suggested that we have judges because we don’t trust ourselves.

I have reflected on the importance of being able to define trust and trustworthiness. Dr Jonathan Ives described trust as ‘to believe as expected’, whereas trustworthiness is ‘to have a good reason to trust based on past events’. The example he gave of theory and principle helps show this point: we trust the principle of gravity because the apple reliably falls from the tree; however, we cannot view AI in the same way.

article on trust and trustworthiness

The discussion around trust being an emotion was fascinating because, as a lawyer, it made me question how we could even begin to regulate this. I also wondered how this fits in with emotional AI and the current regulation we have. I believe that there may be a place for this in my research.

The global context of AI

This area considered whether there is an arms race, and it was interesting to ponder whether any past technology has ever had the same disruptive capacity.

The value of data in healthcare

There were so many genuinely great examples showing how NHS data can help people in many situations, from imaging solutions to cancer treatment. I also found the Data Lens part very interesting: it provides a search function across health and social care databases so that researchers can find data for their studies. The ability to undertake research to help medical prevention and treatment is excellent. I also found it interesting that the NHS uses the database to reduce professional indemnity claims, and I wondered about the parameters in place to ensure this data is used for good.

slide on Brainomix information

The development of frameworks

NHSX is working with the Ada Lovelace Institute to create an AI risk assessment similar to a DPIA. The NHS is also looking to have a joined-up approach between regulators and has mapped the stages of this; I am looking for that mapping exercise and may request it if I’m unable to locate it. I was also encouraged to hear how many organisations benefit from public engagement and expect this from their innovators.

slide on AI ethics - A responsible approach

Overall learnings from the event

  • Healthcare could provide a possible case study for my research
  • I have more to consider about how to build trust into my research
  • Regulation done in the right way can be a driving force for innovation
  • Don’t assume that your technology is answering your problem
  • It’s ok to have questions without answers
  • Debating the problems can lead to interesting, friendly challenges and new ideas
  • Massive learning point: understand the problem