Secretum Blockchain-based Messaging

In a previous blog post, I talked about the opportunities and challenges of building a blockchain-based messaging application. There are many efforts trying to bring blockchain-based messaging to the market. Secretum is one recent example. I took a look at their white paper to understand more about how they designed the application. Although it provides useful information about the motivation for blockchain-based messaging and some aspects of the design, I hope that a more thorough white paper will follow that explains how Secretum works in detail.

Before I begin with the technical aspects, the way Secretum motivates blockchain-based messaging includes some very nice examples worth noting. They reference an NPR article about how the data of hundreds of millions of users were compromised in a breach at Facebook, which shows the scale of the risk of such breaches. They also make an interesting point about how messaging apps like Telegram have seen an increase in usage by hackers sharing stolen/illegal information. The interesting question about these arguments against centralized messaging is whether a decentralized solution can avoid such problems. It appears to me that the risk is the same (if not higher) in decentralized deployments, where it would take more effort to address.

Now let’s get to the technology/design side. At a high level, Secretum’s system model consists of the smart contracts on blockchain (specifically on Solana), a network of nodes outside of the blockchain to handle communication of messages and storage, and the users. As expected, the messaging between users utilizes end-to-end encryption to ensure that no party can know the content of messages. However, this is something that is already implemented by many messaging apps including ones that are not decentralized. Secretum motivates the need for their decentralized solution by mentioning other potential breaches of privacy. They provide the following examples:

(1) Requiring a phone number for registration, which is done by apps such as WhatsApp and Signal. The use of phone numbers as the identity/identifier of a user makes it possible for hackers to target users based on their phone numbers. Coupled with vulnerabilities in WhatsApp, this can lead to hackers gaining access to a user’s phone by knowing only their phone number, as some recent news has shown.

(2) Collection of IP addresses (and other connectivity information). Sometimes this is required by the app (e.g., Telegram), and sometimes it is something that can be inferred from nearby nodes (e.g., Signal).

(3) Reviewing of message contents. If messages are monitored, they are at risk of insider or outsider attacks. The white paper indicates that this is the case in WhatsApp.

Secretum uses the points above as motivation and claims that it does not suffer from these problems. This is done by the following protocol: a number of off-chain Secretum nodes act as a middle layer between users and the Solana blockchain. An off-chain Secretum node maintains a subset of user accounts (and their IP addresses). A node also holds staked SER tokens (the Secretum token) that are used to initiate communication. When a user x wants to communicate with another user y, node X (corresponding to user x) communicates with node Y (corresponding to user y) by first requesting permission from the smart contract. The smart contract “routes” this request to Y. The nodes then connect the users to each other, since they know the users’ IP addresses.
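To make the flow above concrete, here is a minimal in-memory sketch of how I read it. It is not Secretum's implementation: the class names (`Node`, `RoutingContract`), the `min_stake` check, and the method signatures are all my own assumptions for illustration, and nothing here touches Solana.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy off-chain node: holds a stake and the IPs of its own users."""
    name: str
    stake: int
    user_ips: dict = field(default_factory=dict)

    def accept(self, caller_ip: str, callee: str) -> str:
        # In this sketch the callee's node simply reveals the callee's IP.
        return self.user_ips[callee]


class RoutingContract:
    """Stand-in for the on-chain contract (hypothetical interface)."""

    def __init__(self, min_stake: int):
        self.min_stake = min_stake
        self.home_node = {}  # user -> the node that maintains their account

    def register(self, user: str, node: Node) -> None:
        self.home_node[user] = node

    def request_session(self, caller: str, callee: str):
        src, dst = self.home_node[caller], self.home_node[callee]
        # Assumed rule: the caller's node needs enough stake to start a session.
        if src.stake < self.min_stake:
            raise PermissionError("caller's node has insufficient stake")
        # "Route" the request to the callee's node, then pair the two IPs.
        caller_ip = src.user_ips[caller]
        callee_ip = dst.accept(caller_ip, callee)
        return caller_ip, callee_ip
```

For example, registering user x on node X and user y on node Y and calling `request_session("x", "y")` returns the pair of IP addresses. Note that in this reading both nodes end up holding user IP addresses.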

A big question mark that I have for this design is how the off-chain Secretum nodes avoid posing a threat similar to that of centralized servers in existing non-blockchain messaging apps. I couldn’t find a clear answer in the white paper (let me know if you found the information in the white paper or elsewhere). In particular, IP addresses are still being maintained by these nodes. The nodes also store messages that, although encrypted, reveal the communication patterns of users.

In addition to the messaging aspect of Secretum, they push the idea of sending tokens in a decentralized way. Unlike centralized exchanges (such as Coinbase), decentralized exchanges do not rely on centralized entities to make cryptocurrency transactions. Rather, they utilize smart contracts to perform such operations. Secretum wants to facilitate the operation of decentralized exchanges as an added feature of their messaging app. Although it might seem that this function is unrelated to messaging and forced into the mission of Secretum, I think that integrating payment/exchange functions with messaging apps is a good idea. A non-trivial part of our interactions with others revolves around payments, whether it is paying for a service or dividing the cost of a meal with friends.

In summary, it was very interesting to learn what Secretum does, how they position themselves, and the economic/token model they use for decentralized messaging. I am going to keep an eye out for more details about their design, especially how Secretum nodes operate and coordinate with each other and with the other components of the system.

Is There a Future for Blockchain-based Messaging Apps?

Blockchain enables building decentralized applications in a way that has not been possible before. This makes me, and others excited by this new technology, think continuously about the applications that can really benefit from this new decentralized infrastructure. As many blockchain enthusiasts know, there are not many clear cases where building a blockchain-based application would be superior to building an application using existing technologies such as cloud computing and web 2.0. The reasons for this can be grouped into two points:

(1) Blockchain-based apps incur high overhead: To build a decentralized application on blockchain, you must be prepared for both the monetary cost of deploying and running smart contracts as well as the performance overhead as the latency to interact with smart contracts is high.

(2) Does blockchain bring a clear benefit to the application? Compared to other technologies, blockchain offers unique features such as decentralization and a natural coupling with monetary (cryptocurrency) functions. However, are these (and other) features of blockchain important for the users of some applications we consider?

Some time ago, I started thinking of messaging as a blockchain-based decentralized application. Would messaging be a good application in this domain? There is no doubt that messaging is one of the most widely used applications nowadays, across services such as WhatsApp, Telegram, Signal, and others. This is in addition to applications that have messaging as an integral function in their operation, such as online social networking applications. The question I ask is whether the existing (non-blockchain) structure of messaging apps is enough for consumers, or is there a potential for some users to find benefits in a blockchain-based messaging app?

To answer this question, let’s take the two points above on how to evaluate whether a blockchain-based app can be superior to traditional (cloud-based) apps. The first is the costs involved. Messaging is an inherently decentralized/peer-to-peer-type application. When I want to message a friend, it is sufficient for my message to go straight from my device to their device. In fact, this is how many existing messaging applications operate today. This means that the majority of the operation of a messaging app is performed between the devices of the users (outside of the blockchain), which reduces the costs involved. Operation on the blockchain can be limited to serving as the address book of users.

Although this means that the core of the messaging app may not incur significant cost/performance overhead, this is only the case if we are willing to forgo some nice features that cloud-based messaging apps provide us. One feature is the ability to store messages on users’ behalf to make the delivery of messages more reliable and efficient. This, however, is not necessarily a fundamental problem or limitation of blockchain-based apps. Currently, there is significant progress in off-chain and layer-2 solutions that may enable supporting such functionality without threatening the decentralized nature of the blockchain-based application. (This is another topic that I may expand on in a future blog post.)

The second point to consider is whether blockchain offers a unique advantage that would incentivize (at least some) users to switch to a blockchain-based messaging application. Let’s see some of the unique aspects of blockchain and whether they are relevant to messaging apps. The first aspect is “decentralization”: the absence of a centralized entity to control the application beyond the deployed smart contracts. Messaging apps, similar to online social networks, are the topic of continued debates about data privacy and protection. A centralized entity that controls the messaging application is likely to find ways to profit from such control. The common way nowadays is monetizing users’ data, for example through ads and selling data to third-party companies. Decentralization can be a way to avoid such monetization of data. However, it remains unclear whether a significant percentage of the population would care about this issue enough to leave traditional (non-blockchain-based) messaging apps and accept the possible shortcomings of decentralized applications. Furthermore, centralization is actually desired in certain situations. For example, a centralized entity that can help prevent and control spam would be welcomed by many users.

Another advantage of building on blockchain is the coupling with cryptocurrency, which may enable an easier way to exchange payments. In the context of messaging apps, this may offer an opportunity to build messaging apps that can send/receive payments without extra steps and integrations. This seems to be a feature that many users anticipate, and several existing messaging and online social networking applications have already integrated or started integrating payments into their platforms. Whether the stronger coupling of payments in blockchain-based applications would make such integration easier and more natural for users remains to be proven as blockchain-based applications gain more adoption.

From a high-level view, there are some unique properties of blockchain that may make a decentralized messaging app attractive to a segment of users. In this post, I mention some of these reasons as well as why they might not be strong cases for adoption. We should not treat these counterarguments as reasons to stop thinking about this further. Rather, I see them as a collection of challenges to overcome in the quest of building blockchain-based messaging applications. The recent advancements in off-chain solutions, usability, and web3 may provide the infrastructure that would bring out the benefits of blockchain-based messaging apps.

Edge-Cloud Systems at ICDE 2021

In the past three years, we at EdgeLab have been working on the problem of building edge-cloud systems: systems that span both cloud and edge computing infrastructure. We focus on this edge-cloud architecture because it appears to us to be a promising way to enable emerging real-time edge and IoT applications that require both fast response (which we may get by placing resources close to users at the edge) and high-performance compute and integrity (which we may get from cloud resources). These applications, such as ones based on interactive and collaborative mobile applications, Augmented/Virtual Reality, and smart vehicles/drones, are incredibly cool and have undergone extensive research for decades. However, we believe that a technology gap stands between the advances made in these applications and realizing their full potential in real life. That gap is the lack of data management systems that are designed specifically for an edge-cloud architecture and that give these applications the illusion of compute that is both fast and high-performance.


This observation has been made by many other teams as well and has resulted in research and prototypes that are exploring this space of edge-cloud systems. In this year’s CIDR, the authors of VergeDB argue that the emergence of IoT applications requires “a significant amount of data processing to happen on edge devices” [1]. However, they observe that their edge-based system is not going to perform all tasks of the overall system. Rather, the edge-based system should work cooperatively with the cloud system by performing in-situ analytics and compression. Likewise, Neurosurgeon’s authors explore edge-cloud cooperative processing for deep neural networks, where they find that splitting the layers of a deep neural network into an edge part and a cloud part yields better performance and energy efficiency [2]. These examples, and many others, explore edge-cloud cooperative processing and demonstrate the efficacy of such an architecture for analytics and machine learning-based workloads. However, the opportunities of edge-cloud computing go beyond these applications.

The most recent Seattle Report on Database Research [3] motivates the opportunities in edge-cloud systems in two forms. The first is under the heading “Edge and cloud”, where it argues for the need for novel data processing and analytics techniques to accommodate the proliferation of IoT devices with limited capabilities. The second is under the heading “Hybrid cloud” where it argues for the need for new seamless systems that enable “on-premise” (which may represent an edge deployment) and cloud systems to work cooperatively. These two forms highlight the two main challenges in building edge-cloud systems: (1) the asymmetry of the capability of resources on the edge and on the cloud, and (2) the need for designing a single control plane that seamlessly connects edge and cloud resources.

These challenges motivated our work in EdgeLab, where we study distributing systems across edge and cloud resources. We look at various problems, such as data storage and coordination, and various applications, from regular Internet/cloud applications to ones based on machine learning and video analytics. Our most recent published projects are appearing in ICDE 2021 next week. In the following, I will give a brief overview of our three papers and pointers to where you can read/hear more about them.


The first project is CooLSM, led by Natasha Mittal. In CooLSM, we study distributing storage across edge and cloud resources [4]. Our goal is to place storage tasks that need to be done in real time (such as data ingestion and caching) close to users at the edge, while placing tasks that require more compute and storage (such as performing read queries, storing the full copy of the data, and recovery) in the cloud. We observe that Log-Structured Merge (LSM) trees capture this trade-off. In terms of edge concerns, LSM trees are designed to perform ingestion fast and keep the most recent data at the beginning of the path of execution (which can act as an LRU cache). In terms of cloud concerns, LSM leaves the heavy lifting (compaction and storage of large memory segments) to the higher LSM levels at the end of the path of execution. Starting from this observation, we build our distributed CooLSM storage by deconstructing the LSM tree structure, placing the low levels (responsible for ingestion and maintaining the “cache”) at the edge and the rest of the levels (responsible for slow compaction and storage of large segments of data) in the cloud. We then augment this design with a component that stores backups in the cloud to process read-only queries away from the nodes performing ingestion and compaction. The paper also explores using this deconstruction to scale compaction by dividing it across machines. The combination of deconstructing LSM trees and adding backup nodes opens interesting questions about the performance-accuracy trade-off of such a distributed design. In the paper, we provide some ways to think about the consistency of operations across these distributed components.
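The split described above can be illustrated with a toy sketch. This is not the CooLSM code: the class names (`EdgeIngestor`, `CloudLevel`), the dictionary-based runs, and the tiny flush threshold are assumptions I made to keep the example short, and backups and consistency are left out entirely.

```python
class CloudLevel:
    """Stand-in for the upper LSM levels: stores and compacts flushed runs."""

    def __init__(self):
        self.runs = []  # each run is a dict flushed by the edge, newest last

    def receive_run(self, run):
        self.runs.append(dict(run))

    def compact(self):
        # Merge all runs into one; later (newer) runs overwrite older values.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [merged]

    def get(self, key):
        # Search from the newest run to the oldest.
        for run in reversed(self.runs):
            if key in run:
                return run[key]
        return None


class EdgeIngestor:
    """Stand-in for the low LSM levels: fast ingestion and a small 'cache'."""

    def __init__(self, cloud, capacity=2):
        self.cloud = cloud
        self.capacity = capacity
        self.memtable = {}

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.capacity:
            # Flush the full memtable to the cloud and start a fresh one.
            self.cloud.receive_run(self.memtable)
            self.memtable = {}

    def get(self, key):
        # Recent data is answered at the edge; older data falls through.
        if key in self.memtable:
            return self.memtable[key]
        return self.cloud.get(key)
```

Writes only ever touch the edge object, while compaction runs on the cloud object whenever it is convenient, which is the division of labor the paragraph describes.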

The second project is WedgeChain [5]. WedgeChain tackles the challenge that edge nodes might be untrusted. This can be due to edge nodes running on infrastructure that is outside the trust domain of the application, or due to less-capable edge devices being more susceptible to arbitrary software/hardware errors and malicious breaches. Existing solutions would do one of two things: (1) edge nodes would coordinate via a byzantine agreement protocol to tolerate malicious/arbitrary behavior, or (2) a trusted node, typically centralized in a public or private cloud, would authenticate every operation and data item and then send them to edge nodes to be served to users. The authentication, through data structures such as Merkle Trees, allows the potentially untrusted edge node to provide proof of the authenticity of the served data items. Both of these approaches incur significant overhead, either due to the extensive coordination of byzantine fault-tolerant protocols or due to the large wide-area latency between the edge and the trusted entity. WedgeChain proposes relaxing trust requirements in the following way. Instead of having every data item authenticated synchronously with the trusted entity, authentication and certification of operations at the edge are done lazily. Specifically, edge nodes are allowed to act maliciously, but record-keeping mechanisms guarantee that malicious activity will eventually be detected. This eventual detection might be enough to deter nodes from acting maliciously in certain applications (for example, if the identity of the operator of the edge node is known and the penalties for malicious actions outweigh the benefit of the malicious act).
We take this observation of lazy trust and build an index at the edge that incrementally and lazily constructs an authenticated data structure, drawing inspiration from the structure of mLSM [6].
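The idea of lazy trust can be sketched in a few lines. This is a heavily simplified illustration and not the WedgeChain protocol: the receipt format, the `certify` and `audit` functions, and the odd-node rule of the Merkle tree are my own assumptions, and signatures (which a real system would need so that a receipt can serve as evidence) are omitted.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root of a simple Merkle tree; an odd node is paired with itself."""
    if not leaves:
        return digest(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


class EdgeNode:
    def __init__(self):
        self.log = []

    def write(self, key: str, value: str) -> bytes:
        entry = f"{key}={value}".encode()
        self.log.append(entry)
        # The receipt goes to the client right away, before any certification.
        return digest(entry)

    def report(self):
        """Called lazily; a malicious node could tamper with what it reports."""
        return list(self.log)


def certify(reported_log):
    """Cloud side: certify the reported log, return (root, set of leaves)."""
    leaves = [digest(entry) for entry in reported_log]
    return merkle_root(leaves), set(leaves)


def audit(receipt: bytes, certified_leaves) -> bool:
    """Client side: a receipt missing from the certified log exposes the edge."""
    return receipt in certified_leaves
```

The write path never waits on the cloud. If the edge later reports a log that drops or alters an entry, the client's receipt no longer matches any certified leaf, so the misbehavior is detected after the fact rather than prevented.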

The third piece of work is Predict and Write (PnW) [7], led by Saeed Kargar. PnW tackles two problems associated with NVM storage systems on edge devices: lifetime and energy efficiency. Although these two challenges are more significant for edge devices with less-capable hardware (and where maintenance incurs more overhead), PnW is designed to improve the lifetime and energy efficiency of general NVM storage systems. PnW, similar to some prior work, makes the observation that reducing the number of bit flips of a write operation to an NVM segment reduces both the wear on the memory segment and the energy needed to perform the write. Prior methods reduce bit flips by manipulating the bits of the written data to better match the content of the memory location that it is written to. Other methods aim to reduce the number of write operations to reduce the wear on NVM storage. PnW instead proposes to proactively and judiciously pick the memory locations for new write operations to reduce the number of bit flips. Specifically, PnW finds a memory location whose current content is similar to the data item to be written. To this end, PnW builds a machine learning model that groups free memory locations into clusters based on their Hamming distance similarity. When a new write arrives, the model picks a memory location from the cluster that is most similar to the content of the new write. PnW demonstrates that proactively choosing the free memory location for a new write operation has the potential to reduce the number of bit flips significantly.
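Here is a minimal sketch of the placement idea, under strong simplifying assumptions. It is not the PnW implementation: instead of training a clustering model, it takes a fixed, hypothetical list of centroids, represents memory contents as small integers, and uses made-up function names (`build_clusters`, `pick_location`).

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits, i.e. bit flips needed to turn a into b."""
    return bin(a ^ b).count("1")


def build_clusters(free, centroids):
    """Group free addresses by the centroid nearest to their current content.

    free: dict mapping address -> current content of that free location.
    centroids: hypothetical, fixed bit patterns standing in for a trained model.
    """
    clusters = {c: [] for c in centroids}
    for addr, content in free.items():
        nearest = min(centroids, key=lambda c: hamming(c, content))
        clusters[nearest].append(addr)
    return clusters


def pick_location(data, free, clusters):
    """Choose a free address for `data`; return (address, bit flips incurred)."""
    # Try clusters from most to least similar to the data, skipping empty ones.
    for centroid in sorted(clusters, key=lambda c: hamming(c, data)):
        if clusters[centroid]:
            addr = min(clusters[centroid], key=lambda a: hamming(free[a], data))
            clusters[centroid].remove(addr)
            return addr, hamming(free.pop(addr), data)
    raise MemoryError("no free memory location left")
```

Because only one cluster is searched per write, the chosen location is not always the global best match. That is the price of not comparing the new data against every free location.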

We are excited by this year’s ICDE and are looking forward to presenting and discussing our edge-cloud work as well as the other wonderful papers in the ICDE program.

References:

[1] Paparrizos, John, Chunwei Liu, Bruno Barbarioli, Johnny Hwang, Ikraduya Edian, Aaron J. Elmore, Michael J. Franklin, and Sanjay Krishnan. “VergeDB: A Database for IoT Analytics on Edge Devices.” In CIDR 2021.

[2] Kang, Yiping, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, and Lingjia Tang. “Neurosurgeon: Collaborative intelligence between the cloud and mobile edge.” ACM SIGARCH Computer Architecture News 45, no. 1 (2017): 615-629.

[3] Daniel Abadi, Anastasia Ailamaki, David Andersen, Peter Bailis, Magdalena Balazinska, Philip Bernstein, Peter Boncz, Surajit Chaudhuri, Alvin Cheung, AnHai Doan, Luna Dong, Michael J. Franklin, Juliana Freire, Alon Halevy, Joseph M. Hellerstein, Stratos Idreos, Donald Kossmann, Tim Kraska, Sailesh Krishnamurthy, Volker Markl, Sergey Melnik, Tova Milo, C. Mohan, Thomas Neumann, Beng Chin Ooi, Fatma Ozcan, Jignesh Patel, Andrew Pavlo, Raluca Popa, Raghu Ramakrishnan, Christopher Ré, Michael Stonebraker, and Dan Suciu. “The Seattle Report on Database Research.” ACM SIGMOD Record 48, no. 4 (2020): 44-53.

[4] Natasha Mittal and Faisal Nawab. “CooLSM: Distributed and Cooperative Indexing Across Edge and Cloud Machines.” In ICDE 2021.

[5] Faisal Nawab. “WedgeChain: A Trusted Edge-Cloud Store With Asynchronous (Lazy) Trust.” In ICDE 2021.

[6] Raju, Pandian, Soujanya Ponnapalli, Evan Kaminsky, Gilad Oved, Zachary Keener, Vijay Chidambaram, and Ittai Abraham. “mLSM: Making Authenticated Storage Faster in Ethereum.” In 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 18). 2018.

[7] Kargar, Saeed, Heiner Litz, and Faisal Nawab. “Predict and Write: Using K-Means Clustering to Extend the Lifetime of NVM Storage.” In ICDE 2021.

A Research Lab Is Not a Factory

When I graduated with my Ph.D. three years ago and set out to build a research lab as an assistant professor, I was confident that I was ready for this new role as the leader of a research lab. My confidence came from the view that I had spent many years as a research assistant and Ph.D. student, so surely after all those years I was ready and knew how to do and lead research independently. But I was greatly surprised during these first years, with every step I learned, to realize how completely ignorant I had been of the difference between doing research and leading a research team. In the rest of this post, I share some of what I have observed:


First: A good researcher is not necessarily a good research manager

The first observation is that the skills you acquire as a researcher are a small part of the skills you need as the director of a research lab. As a researcher, you acquire the skills of finding a research problem and solving it independently, and of being an active member of your research community. As a lab director, there are other matters to work on. For example, there must be a long-term, forward-looking vision for the goals of the lab. For a Ph.D. researcher, the research life cycle is a few years, but for a research lab the life cycle is tens of years. Leading a lab with the same cycle as a Ph.D. researcher confines its research horizon and impact to that narrow time window (a few years), repeated with each incoming cohort. Working with a long-term vision and goal, in contrast, makes it possible to take on deeper work that requires longer time spans.

Another example is that acquiring research skills is different from having the skills to train new researchers in them. The remaining points touch on some aspects of this difference.

Second: A research lab is not a factory

One of the main tasks, and in my opinion the most impactful, is training the new researchers in your team. Good training can give you independent researchers who can carry out and lead innovative research that serves the long-term goal of your lab. One of the most common mistakes I see in research labs is treating researchers as factory workers. In these cases, the research director does all of the creative process and only hands out specific tasks and experiments to the research team, without them having a role in the creative process or in the direction the project takes. This mistake is caused either by a condescending view and a lack of trust that new researchers can contribute to the creative process, or by laziness and not seeing the benefit of spending time training students who will leave after a few years.

The problem with these ways of thinking is that they lead to a research team that works like a factory. But research is a creative process more than a mechanical one. The “factory” research team suffers from the following problems:

A. The team director is the only source of ideas: The problem here is that the director is one person, and their thinking will necessarily be limited by their experience. In many cases, having worked for many years on closely related research problems focuses the director’s thinking on particular approaches and leaves them holding research beliefs that may have become wrong. Having new (“green”) members refreshes ideas and leads to discovering new directions.

B. Frustration of team members: Researchers, when they join a Ph.D. program, are driven by a passion for discovery and for entering the world of research and creative thinking. If you suppress this passion, the student’s enthusiasm and dedication will fade, and even the “factory” work will become impossible. Ph.D. researchers face many difficulties and pressures, and the one thing that motivates them to continue is that they are doing creative work and are part of the scientific mission. If you remove this part, the inevitable result is frustration and failure in their studies and research.

Third: Believing that your leadership has an impact

Another mistake that many make is believing that a student’s outcomes depend on the student alone and not on the leadership of the research team’s director. Blaming “luck” prevents you from developing research leadership skills. At first glance this might seem right: if the student is good, they will be able to do research and learn independently. But what you lose is that many researchers with potential may not get the opportunity to discover and use their abilities. To become convinced of your impact on your team’s outcomes, look at the research labs in your field and how some of them have far greater impact than others, even when they are at the same level and have the same qualifications as the other labs. Another way to gain this conviction is to try taking on this leadership role in motivating and supporting your students and in the journey of discovering their talents, and see the result for yourself.

Fourth: Research is not everything

Research (as papers or projects) is not everything when talking about a research lab. The big, long-term goal and the impact are far more important. Achieving these is the job of the team’s director, through engaging with the research and practitioner communities and broadening their awareness to anticipate the future and the goals that suit the lab.

Fifth: Reading and asking

Finally, leading a research lab is a topic shared by many academics, and you will find many resources for benefiting from the experiences of others. Reading, asking, and learning from the experiences of others is a necessary investment that you must make. For most researchers, learning how to lead a research lab is not part of their studies or research work, which means the task falls on you to learn it yourself.

Your Academic Job Talk During the Pandemic

As we get closer to the academic job application deadlines, it is time to start thinking about your academic job talk. Preparing a job talk is stressful. Whether or not you get the position depends largely on your performance during the talk. And there are many hard decisions that you need to make that might make or break your talk. This year, there is an added stressor: academic job talks are likely going to be virtual, like last year. Preparing a virtual academic job talk is not the same as preparing an in-person one. I had the opportunity to experience some of the challenges of preparing and giving a virtual academic job talk last season, and in this blog post I will share some of the things that worked and some of the things I wish I had done.


1. Being present

Whether you think it is a good or a bad thing, the hiring decision for an academic position depends largely on the candidate’s presence and the impression everyone gets. Being in this largely social profession, you have probably acquired many soft skills for making a good impression. But these might not be applicable to an online setting, or worse, they might backfire. The best way to overcome this is to practice giving online talks and visits and to get feedback from mentors and colleagues. Also, record yourself giving part of the talk on the computer that you will use, to see what others will see. Can they see the slides and your camera, and hear your voice clearly? Unfortunately, we might not realize that we are using a bad-quality camera or microphone, which might have a significant impact (think of a low-quality microphone that adds noise, which can be really distracting during a talk).

2. Home not alone

Well, you might not be the only one doing remote work during your interview. If you have a partner or children, it is most likely that they will have to be home for some or all of the time while you are in an interview. This creates many opportunities for your small children to be part of the interview. You might stress about this, afraid that it will be deemed unprofessional, and try to avoid it at the cost of added anxiety to yourself and your family. Having been on both ends of these situations, I can say it is nothing to stress about. It is almost expected to happen eventually for a candidate doing multiple day-long interviews. Not stressing about it and handling it gracefully would actually make a good impression, and who does not like to see a cute child’s face in the middle of a work day! But it is also good to prepare. If your child is old enough, talk to them about your interview and ask that, if they must interrupt, they do so in a non-disruptive way. What this means depends on their age; for example, I told my 6-year-old to enter the room, sit on the bed in front of me, and wait for me to talk to him. Younger kids would probably jump right onto your lap. Introduce them briefly, and prepare a personal joke/anecdote about this situation (“personal” because by now everyone has heard every generic one already ;-)). And, sometimes, it is better to ask to leave an interview/meeting for 5 minutes to resolve an interruption rather than being distracted for the whole meeting. Another thing you might ask for is breaks during the day, for example, during an important remote class/exam your child is taking or an important meeting your partner has.

In general, think about all the possible interruption scenarios and have a protocol for how you would handle them. Do not leave this to the day of the interview. And do not feel that you are bound to the interview structure that the host department sends you. Tell them what you need.

Having said this, it seems that many workplaces have started establishing protocols for faculty/students who need a personal space for teaching or other activities. It does not hurt to ask if you can work from campus during your interviews – or at least during the talk, if you feel that it is best for you.

3. A talk to an invisible audience: push through the doubts

One of the unexpected challenges that I faced during my online job talk is that I could not see the faces of the audience! We rely on the facial expressions of the audience to know whether what we said was clear or whether it was trivial, which guides us on expanding, repeating, or skipping parts. Giving the talk to a bunch of avatars was definitely an isolating experience, and it might raise doubts and anxiety about the reception of your talk. What’s the solution? First, find ways to engage the audience. In parts where there is a transition or insight, ask a question relevant to that transition or insight. Prepare such questions in multiple places in your talk, and ask one if you feel that you have not heard from the audience for some time. Hearing the audience’s feedback and answers would reassure you about how the talk is being received. Second, and most importantly, try to ignore these doubts and push through the talk as your best and most enthusiastic self. Having doubts might reduce your performance and confidence.

4. The technical difficulties

Ah! Technical difficulties. These are the worst. You cannot control them and they seem to happen at exactly the worst time. You should have a plan and protocol to follow if a technical difficulty happens, from small ones to disruptive ones. For example, what will you say if someone’s connection is dropping and you cannot hear everything they say? How many times are you willing to retry connecting to a meeting that keeps dropping before you give up and send an apologetic email? Something that is frequent in online meetings is delays. You should notice when a delay is happening and realize that you might be speaking over your interviewer unintentionally. Acknowledging the situation helps relieve the tension.

5. Take advantage of the virtual setting

Many of the points above are about the challenges and difficulties of doing remote interviews. But there are advantages, too. Besides the obvious ones, such as not having to travel, not dealing with jetlag, and being in a comfortable space, there are things that you can do that wouldn’t be possible otherwise. One that I really like is that no one knows what is on your screen and desk. You can have notes, pointers, and other helpful material. For example, I wrote a transcript of the first couple of slides, which is a part that I always struggle with when delivering talks. Similarly, when meeting interviewers, you can swiftly share your screen to show off any cool demos related to what you are discussing. Prepare to take advantage of this in a way relevant to your talk and work.

I wish you all the best of luck and I hope my experience helps! For anyone interested, I also include the presentation slides that I used in my interview:

https://docs.google.com/presentation/d/1VDU5BsomGT3pDkIsbdxDcFf-WONzqleflgPyywJz1Oo/edit?usp=sharing

You might also be interested in my previous post about writing research and teaching statements: http://nawab.me/blog/?p=685

Research and Teaching Statement Samples

It is almost October: the month when graduating students and postdocs realize they need to start working on their job applications. Those looking for academic jobs need to write research, teaching, and diversity statements. When I was applying, it really helped me to see the application material of other applicants. Below, I share my application material for two application cycles, and I hope some will find it helpful.

2016-2017 application: These are my statements when I was fresh out of the Ph.D program

Research Statement

Teaching Statement

2019-2020 application: These are my statements after spending a couple of years as an Assistant Professor. Having served on recruitment committees and read many research statements, I tried to mimic the best of what I saw and what has worked for others.

Research Statement

Teaching Statement

Diversity Statement

Other material: In addition to the statements, below is a sample of a cover letter and CV (the CV is a link to my current CV, but it is very similar to what I used in the application)

Cover letter sample

C.V.

Building Empathy towards Students in Introductory Courses

“A whole week to learn if-else statements!” I told my colleague in dismay while I was preparing to teach my first course on “intro to programming in python”. I took the syllabus from the professor who taught the course many times before and started getting anxious when I saw a whole lecture dedicated to variables. I imagined I would address students: “friends, x=1; that’s a variable.” That would take 5 seconds. What should I do to fill the remaining 64 minutes and 55 seconds?

At the same time, I knew that an intro to python class should be structured this way, and learning the basic concepts of programming would take time. The main problem I had is that I was not able to put myself in the students’ shoes. I do not remember my experience learning programming (which was also shocking to me.) I was also introduced to programming prior to going to college, which removed me further from the experience of students exposed to programming for the first time in the formal setting of a college course.

This created a problem. I was not able to reason about what students need and what the best approach and material were for their learning process. Worse, I was afraid that I would be so unsympathetic that I would belittle their struggles and problems throughout the course (which, unfortunately, is something many instructors fall victim to.) We hear of many cases of professors telling students that “computer science is not your cup of tea” because these professors did not relate to the struggles and experiences of a newcomer student.

So, that led me to embark on a quest to understand these struggles. And I found that the best way to remind myself of these struggles is to learn something new myself! And by something new I do not mean, oh, I am going to read a research paper on programming languages instead of data management. I mean TOTALLY new. I started taking beginner lessons and reflecting on these experiences, as well as on recent past experiences of learning something totally new. I will talk more about the one experience that affected my teaching the most: learning to paint.

Until recently, I couldn’t even draw a normal-looking stickman. When someone saw my drawings accidentally, they would mistake them for the work of my then three-year-old child. So, it was perfect for my mission. I signed up for lessons. We started by learning basic shapes and worked our way through understanding lighting/shadows to drawing more complex objects. I was surprised by the many analogies I found with the experience of beginning students in programming and computer science. Here are some of these analogies.

1. Analogy 1: The 10-year-old who is taking the beginner class with me

In some of the painting lessons, I was joined by a 10-year-old. She was also taking the beginner class, but she was not the same kind of beginner I was. She had been drawing as a hobby since she was a small child and could paint beautifully, picking up what the instructor was teaching swiftly and perfectly. Compared to her, I was struggling! For each task, I spent a much longer time, and couldn’t do it right half the time! I felt embarrassed and couldn’t help but wonder what she thought of this struggling adult many times her age. I wondered whether this was not my place and I should just leave. And it hit me! This is what our students feel when they see the “star” student who is answering all questions perfectly and jumping ahead to topics they would not see until their senior year. They feel out of place, as if they do not have what it takes to be in this program, just because someone else had different circumstances and experiences that gave them a head start.

2. Analogy 2: It is fine to make mistakes, just debug it and fix it!

It turns out that the painting process involves a lot of “debugging”. My naive self thought that a painter would just do their magic with no mistakes! I immediately remembered my frustration when the TAs and I found that students were not comfortable making mistakes and debugging/fixing them, and that it was not natural to them. But I found myself doing the same thing: working extremely slowly on drawing a sphere, extremely afraid of pressing too hard or drawing a disproportionate line. And when the instructor would say “do not be afraid of making mistakes”, well, I was still afraid and clueless. I did not even know how to go about fixing the problem. I only started getting it when I saw the instructor “debug” their drawing in front of me and after doing it countless times myself. I realized that debugging itself is a skill that is not obvious for students and requires a lot of practice even for the simplest tasks.

3. Analogy 3: the milestones that keep you going

After a few painting lessons, I got really stressed about my progress and felt completely lost. I was not getting the shapes right, and the shadows looked as if they came from an alien world where the laws of physics do not hold. No matter how much I tried to focus and attend to each detail in the process, I could not make it work. That was another moment where I thought that I should just give up. But something happened! Out of that frustration, I was able to draw my first object that had correct proportions and shadows! That moment created a boost that made me able to go on. Getting such a boost (and sense of accomplishment) was surprising, as it was just a sphere! It revealed to me the importance of these milestones, where the student feels that something clicked and they have started making progress. Students often feel frustrated with not being able to get a concept. But this frustration is a sign that they are now learning something completely new. This shows the importance of being with the students and making them aware that this frustration is a normal part of learning rather than a signal to leave.

There were a lot of other small analogies and experiences that really helped me relate to students starting to learn programming and computer science. Learning a new concept is an isolating (stressful) experience with results that a newcomer cannot expect. One important realization is that the instructor should provide the time and space for students to work on their learning process while providing the support, encouragement, and perspective of what is coming next. And critical to our field, we should create a safe environment with the assumption that everyone is here to learn; we are not only catering to the top performers. Relative learning differences, in the beginning, should not discourage students from continuing and enjoying the journey.

The Case for Underground Research

Do you enjoy being one of 5 headbangers at your local band’s gig (2 of whom are the bassist’s parents)? Do you still read the poems of a schoolmate you hardly knew 20 years ago who is surprisingly still writing despite never being published? If you are like many others who find value and enjoyment in underground and alternative creative work, then why aren’t you doing the same for research work?

If you ask any researcher “how do you find papers to read?”, the answer is typically going to be by following the top-tier conferences and journals of their field. “Top-tier”? That’s another way of saying mainstream. Conformists! If you question such a reader, they will barrage you with reasons, such as the work being higher quality, more impactful, and coming from the strongest groups in the field. And I understand. This makes sense. But, like listening to mainstream music and McEating in an international franchise, there is something artificial about research work published in “top-tier” venues.

I started wondering why I get this sense of artificiality when reading from (and writing to) top-tier conferences. I think that the bar that is set to decide which papers get in and which do not has created both superficial standards that are concerned with aesthetics and elegance and contrived standards that are concerned with contribution and novelty. And, again, I understand. The superficial standards reflect our inner desire for papers that are enjoyable to read. And I understand the need for high standards to let in the works that push the boundary of science.

However, the aesthetics and elegance factors typically reflect our unconscious desire for papers that make us comfortable by conforming to the research community’s standards and ways of thinking. And the contribution/novelty standards are typically translated into finding niches of work that no one has explored before. The danger in these two implications is that what happens now is “paper engineering”. I do not know whether papers published in top venues are the best work in the field. But I know that they are the best-engineered papers. When I write a paper, I think about solving an interesting research problem. However, I feel that this is for myself. To get the paper published, I think “How can I paper-engineer this?” Unfortunately, this sometimes leads to losing (or obfuscating!) some of the conciseness, clarity, and simplicity of the original work.

Other than these finely paper-engineered works you are used to seeing in top-tier venues, there is another species of research work; let’s call them underground or alternative papers. These are papers written by authors without the goal of publishing in top-tier venues: authors who are doing research work for the enjoyment of doing it or as a pedagogical exercise in educational institutions. In these works, the authors have worked on a problem that they found interesting, found a solution that works, and sent it to a venue with a near 100% acceptance rate. They did not care how to show it is different from that other paper that was published a year ago. They did not care to contrive a complex model for framing their problem and solutions. And most importantly, they did not taint their work with needless complexity and “depth” (some of which might be due to the infamous reviewer 2!)

When you read an underground/alternative paper, you get a problem statement and a solution that works. It is easy. It is straightforward. And the insight is in your face with all its ugliness and simplicity. It did not cause a paradigm shift. It did not open a new field. It solved a problem. And if you work on that problem, that’s what you need. Yes, the paper might look unappetizing, filled with typos and grammatical errors and figures with labels written in Comic Sans. But you get that “realness” that is lost from mainstream “top-tier” papers.

I am not saying stop reading and publishing in top-tier venues. I still think that this is the right medium for such activities. But, do not miss the crude beauty and raw nature of underground/alternative research. And above all, do not forget to support your struggling local researchers.

Academics as Gatekeepers

Recently, I have found myself recalling two experiences I went through with database projects led by visiting professors at tech companies.

The first was at a company where employees and student interns were working on a project to build a database, supervised by a professor from a major university (let’s call him Professor X) who holds every award you can imagine from his academic community. The database was never completed, and every step forward took a long time (if it did not turn into steps backward). Why? The employees working on the project were all software engineers with years of experience. But the following happened in every meeting between them and Professor X, who supervised the project. Some of the employees would present their progress and ideas on the project, and Professor X would criticize and demolish all of their ideas and belittle their understanding of database topics. “The education at your university seems to have been a failure for you to think this way” and, sarcastically, “Yes, you are right and I, the holder of these awards, am wrong” are some examples of what he would say to the employees on his project. The result? Everyone became afraid and wary of commenting and moving the work forward, and many of them came to hate the field entirely.

The second experience involves another professor (let’s call him Professor Y) who is no less distinguished than Professor X. The difference is that Professor Y had a different way of dealing with his team of software engineers. Whenever a topic came up where there was a disagreement or a misunderstanding on the part of one of the engineers, the professor would engage in a discussion to get into the details of the topic. The discussion took place on a level playing field, debating ideas and motivations, without Professor Y brandishing his degree and academic standing. The outcome of such a discussion was either that the engineer had misunderstood some database topic and the discussion helped them understand it in depth, or (and this is the more important case) that the software engineer, drawing on their different, applied experience, had hit upon a problem or a solution outside the scope of academic research and Professor Y’s expertise. The employees gained the confidence to participate, work, and push the project forward quickly, and Professor Y benefited from more applied knowledge that was outside the scope of his academic research.

The reason I keep recalling these two experiences is some of the discussions that have been going on, recently and for a long time, on Twitter. In many of these discussions, the academic takes the role of a gatekeeper who prevents other people from discussing the topic of their narrow specialization. Unfortunately, many of these discussions end with “I have the advanced degree, therefore I am right.” But this kind of thinking overlooks two fundamental points. First, the degree does not mean that you are right, even if you are talking about the core of your specialization and research. When the scientific meets the applied, it leads to complexities that may not be apparent when looking from an abstract standpoint. In addition, applications of the research topic may have different characteristics that fall outside the factors derived in the research context. Second, you may well be right, dear academic, but is it your role to be a gatekeeper and to keep work on the research topic to yourself? As an academic, part of your duty is to spread knowledge to the specialist, the practitioner, and anyone interested in your research field. And this is not done by brandishing the degree, but through continuous and open discussion.

Brandishing the degree and specialization to shut down discussion has many negative consequences. It undermines the credibility of research, making it look like exercises detached from practice. It also drives students and practitioners away from research work and from trying to enter academia. When discussion becomes the exclusive domain of the specialist, the student and the practitioner will not be able to get in. The only ones who can get in are one of two: the student who agrees with the specialists in everything they say, and that is not the academic community that encourages discussion; or the student with charisma and foundational knowledge that matches the specialists’, which pushes many other students away, especially students who may feel that they are “different” from the mainstream in the field.

If your academic field attracts the interest of a large segment of people, take it as an opportunity to become influential by spreading correct information and love of the field, and by encouraging students and practitioners to dive deep into it. Do not be a gatekeeper who makes the research community and public discussion toxic for the sake of a degree and ink on paper.

 

Three Things I Did in Class Last Quarter (That Students Liked)

I have been teaching at UCSC for two years now. I have been blessed with positive student reviews for my classes. But what I like more are the wonderful comments and advice that I get from students: ways to make my teaching better and to make the students’ experience in the class better.

In the previous quarter, I implemented some of the ideas that were suggested by students in prior courses. It turned out to be a success! In fact, in the course reviews of the previous quarter, many students acknowledged these changes and said that they liked them. As a practice of self-reflection and to get feedback from a broader audience, I list these changes here. I would love to hear if you have implemented some of these techniques, or think that some of them might have flaws that outweigh the benefits.

1. Reducing students’ stress by always starting the lecture with a dialogue.

The course I teach the most is the Operating Systems and principles of software design course, which is regarded by many students in our department as the most stressful course in the program. The reason is that we introduce the students to concepts about complexity, large-scale software, concurrency, and the OS kernel, and we also have assignments where students hack an operating system kernel. This, as you can imagine, is a daunting list of topics for students to cover in one course.

I was aware that the requirements of the course increase students’ stress. This is something that I wanted to change. I have tried a variety of techniques to accomplish this, but the one that I felt had the most impact with relatively small effort is the following: always start the lecture with an open question and answer session. Students can ask about anything, whether it is about the material, the coming exams, their programming assignments, or the general logistics of the course. Nothing was off limits and I did not rush answering any questions. Sometimes, this took a significant amount of time (e.g., in the lectures before and after the midterm 🙂), but the outcome is definitely worth it.

With this approach, I noticed that more students felt comfortable coming to my office hours and approaching me after the lecture. And a number of students suffering from anxiety-related illnesses have told me that this made them more relaxed and receptive during lectures. In addition to these benefits to the students, I felt more connected with students’ concerns and their priorities in each lecture. For example, at one lecture I noticed many students asking about the high-level ideas of the programming assignments, which indicated that the intuition of the assignment was not presented clearly. I took the opportunity to integrate the assignment into the examples I used in class, and many of the students’ high-level questions were resolved.

2. Increasing participation by *NEVER* saying that an answer is incorrect.

Student participation is very important. It helps students be engaged and clarify points that are not delivered well in the lecture. It is also a gauge for the instructor to know that the pace of the lecture is not too fast for students. This is a point that I wanted to improve from my previous classes, where only around 10 students would participate actively in lectures. I tried being positive and encouraging toward students’ questions in prior courses, but I only noticed a small improvement in participation.

What really worked in increasing student participation significantly was an extreme solution, one that I was hesitant about but wanted to try out and explore. This solution is to *never* say that an answer is incorrect. If I receive an answer from a student, I try to come up with the most positive spin on the answer. Sometimes, this can be that the first part of the answer is correct. Sometimes, the answer can be viewed as a step toward the complete answer. For example, an incorrect answer can be presented as a correct answer that can be improved. I would validate the answer, describe how it would work in certain cases, and THEN mention that there are some corner cases where it might not work, and poll students for further participation.

When I started this solution, I was worried that students would be confused by some answers and my reaction to them. Indeed, one student in the first week even told me that he wished I would just say that an answer is wrong! However, by the second week, I noticed that this had started to become effective. More students started participating. And many became comfortable sharing answers that were not necessarily correct/complete, and they would acknowledge this during their participation (for example, “the answer is so and so but I am not sure what will happen in [some corner case]”).

This has been especially wonderful for the class I teach because we teach them software design which is an iterative process. And this method has helped students practice this in each lecture!

3. Doing a midterm evaluation.

The update to the class that was liked and mentioned the most in students’ evaluations is doing a midterm evaluation. The final “official” evaluation that is done through the university is performed at the end of the quarter and is only visible to the instructor after the course is done. This makes the students’ suggestions useful only to future classes (which might be one reason why participation in these evaluations could be higher.)

In this quarter, I did a midterm evaluation: in the fifth week, I sent out a survey where students could express their comments and requests about the class anonymously. I received many wonderful comments and requests, and a good number of students participated. For example, students pointed out that the assignments were a bit confusing. Hearing about this concern, I started running an assignment-clarity committee consisting of a handful of students in the class who would help me refine the assignment write-up before it is released. (This also had the side effect that these students would clarify the assignment to other students in discussion sections and on Piazza.)

This gave me the opportunity to improve the course before it ended, and it helped the students feel that they are heard and that the instructor is concerned about their learning process. This, I believe, has other side effects: making students feel that I am on their side, reducing their stress, and increasing their engagement.

Let me know what you think about these changes and whether you have other ways to improve the students’ experience!