Webinar “Personalized Semantic Search using the Digital Systems Engineering Process Model”

Alexander Turkhanov
Aug 31, 2023

The problem with staffing an engineering project

A few months back, the German chapter of INCOSE released the digital systems engineering process model and proposed a specific use case for it. But I believe they proposed the most apparent use case, not the most useful one. They positioned the model as a device that helps you navigate the Systems Engineering Handbook instead of inconveniently searching for information in the paperback version. Looking at it, though, I could not help but think that this is a semantic core for the systems engineering body of knowledge. And what can you do with a semantic core? You can build a search and personalization engine. We at Applied Knowledge Systems decided that we should absolutely build that engine. The only remaining questions were what this search and recommendation engine should look like and what we would be searching for. The answer was obvious: we need to search for systems engineering skills and for the people who have them.

How does one do it? To me, it looks like this. For example, we have a systems engineering management plan as input — it is part of project planning. Once we have a plan, we can look for people with the skills needed to execute it. It should be as simple as that: plans on the input, candidates on the output. The reality, though, according to the Human Resource Management process in the handbook, is cumbersome, very tricky, and very time-consuming. It starts with project planning, which yields a Project Portfolio and Project Human Resource Needs. Then we identify, develop, acquire, and provide skills. On the outputs, we have a human resource management plan, qualified personnel, a human resource management report, and a human resource management record. Nothing that looks like the process I have in mind. It is really time-consuming, and we can eliminate a lot of waste in this area, precisely thanks to the systems engineering process model and the role-based search capability we built.

If we are successful, INCOSE as an organization, Indra, or other engineering companies that include systems engineering in their operations can streamline their human resource management process. And that will not require getting tons of approvals, building connectors, or changing information security profiles, because we built a privacy-first AI assistant that does not send your data outside your workstation or the already approved network. I will explain how it works and show you the prototype. My ask is simple: so far, we have done the difficult job of building the platform and the data processing pipeline. Now we can build a product that actually solves your problems — a fun task you can all participate in.

How searching for skills should look

I want to thank the Spanish chapter of INCOSE and specifically Anabel Fraga, who helped me a lot in organizing this webinar. I also want to thank Alexander Mikhalev, the CEO and developer of Applied Knowledge Systems, who spent a lot of time building the Terraphim search engine. I act as product manager, head of product, and project manager in this small company, as usually happens in start-ups. In corporations, too, we often hold several roles at once, which is directly related to this webinar’s subject.

When you search for something, you do it differently from the perspective of a CEO than from the perspective of a project manager. If I am a project manager preparing an outline project plan and I look up some topic, I need a clear and concise explanation that I can use to formulate project tasks that will pass the project management office. If I am a CEO using the exact same keywords — for example, “Forms of contracting for product companies” — I most definitely need practical guidelines and actual contract templates, preferably with comments from a lawyer for a specific jurisdiction. But search engines do not support that role distinction; maybe only Perplexity provides a decent experience. Even with it, you still need to think about how you will search as a CEO versus as a project manager, and it only supports follow-up queries once you already know what you are looking for and why.

Imagine this scenario. We have a project plan — a systems engineering management plan — and I take an actual task description from the Notion workspace of our Innovate UK project, where we keep forecasted timesheets, registries, and reports, as well as deliverables and milestones. I give this task description to my personal assistant. It consults the systems engineering digital process model and hints that I need project management skills to accomplish that task.

It gives me a description of this skill, so I understand exactly what stands behind it. This is a machine-generated description of the skill required for that task — not mine, and it is actually good. I read it, and it seems legitimate. This is a proposal to collaborate, generated from the task, and I respond: OK, I approve this skill description for my project. What should be the next step? The assistant should add some supporting skill descriptions, because project management does not come alone: it also requires risk management, stakeholder management, communication, and problem-solving skills. Again, these are all skills the machine proposed based on that proposal to collaborate. It generated them, and I said, “OK, can we be more specific?” It then goes to the curated WAND taxonomies — the WAND company has hundreds of them — and elaborates on the skill descriptions using those taxonomies. The project management taxonomy alone contains more than 1,000 terms. I ask myself: which areas of project management am I most interested in for managing this task? It is not completion, not documents and records, not execution — it is planning and design. I selected it and realized I don’t need design; I need planning and methodology. Specifically, I need the critical chain method. Using the WAND taxonomy, I substituted the abstract, broad “project management” skill with the narrow, specific “critical chain” skill.
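The drill-down above can be sketched as a walk down a hierarchical taxonomy. The terms and nesting below are invented for illustration — this is not the actual WAND taxonomy content or format:

```python
# Illustrative skills taxonomy: broad skill at the root, narrower
# terms below. Not the real WAND data, just the shape of the idea.
TAXONOMY = {
    "project management": {
        "completion": {},
        "documents and records": {},
        "execution": {},
        "planning and design": {
            "design": {},
            "planning and methodology": {
                "critical chain": {},
                "critical path": {},
            },
        },
    },
}

def narrow(taxonomy, path):
    """Walk down the taxonomy along `path` and return the narrower terms."""
    node = taxonomy
    for term in path:
        node = node[term]
    return sorted(node)

# Drill down: project management -> planning and design -> planning and methodology
options = narrow(TAXONOMY, ["project management", "planning and design",
                            "planning and methodology"])
print(options)  # ['critical chain', 'critical path']
```

Each step replaces a broad term with its children, until an abstract skill like “project management” becomes a specific one like “critical chain.”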

It does the same with risk quantification, stakeholder analysis, and the communication management plan. Now, from a generic task description of just three lines of text, I have a pretty concrete project proposal with pretty concrete skills, and I want a candidate who has all of them. I say: OK, find me a person who can be a good fit. The assistant goes through my contact list, creates a new page in my Notion workspace for this project, sends me the link, and I see a resume. Again, it is machine-generated, and it is pretty good. I look at the timestamp and see that it was generated one minute ago. The person whose skills it describes spent not a minute — no time at all, zero effort — writing it for this particular task proposal. The machine generated it, and it contains real information, because this person has left many GitHub commits, Notion documents, and Discord messages. We have a lot of information about this person, and the machine simply fetched it and generated a resume from the original work products the candidate created.

And what are the beauty and benefits of generating proposals to collaborate and matching resumes from actual work products, instead of the standard human resources process I described above, with its project portfolio, human resource needs, skill acquisition, and qualified personnel on the outputs? An approach based on generative AI and semantic search finds better candidates for the job and removes the over-specified search for usually overqualified personnel. It also supports each statement in the resume with evidence: a statement is not just a line of text, not just a string. Every sentence is actually a link, and I can verify the evidence in GitHub or Notion to check how valid each statement is. Is it a real thing or not? If unsure, I can ask, “Please generate a controlling question for this statement.” So if the person says that she made realistic project plans, the question would be, “How did you make sure your project plan was realistic?” That’s a legitimate question. Or you can ask, “How did you attend to the project’s performance baseline?” — a project performance baseline does have to be maintained, so the question is to the point. That is how I can arrange the assessment of candidates.

But what is semantic search, and how does all of the above relate to the topic of this webinar? As you can see, we are looking not for strings but for keywords matched in the resume. We are looking for confirmed skills the candidates have obtained. And when I look for skills, I am looking for things — and searching for things, not strings, is a definition of semantic search. What’s important here is that when the results of your search are things, you can organize them into taxonomies.

Context of searching for skills

That’s quite an unusual turn, one could say. What is a taxonomy? If you go to the Amazon website, you see them in the left sidebar. If you are looking for shoes, you will see shoe taxonomies on the left: color, size, brand, price, design, and purpose, all organized into hierarchical structures. Such structures are taxonomies. As we know from online stores, they significantly improve the search experience.

Do we use taxonomies when we search for talent? Not very much. Take LinkedIn, for example, where skills appear as a flat list: you scroll through them, and no structure helps you navigate. This is a consequence of not grounding skill descriptions in real-world work. On LinkedIn, you search for strings, which are very difficult to classify or fit into a taxonomy. The SFIA framework and professional standards like the systems engineering competency framework provide more guidance, but real professional profiles usually use more specific skill names that are not always aligned with the taxonomies described in the standards. In any case, a significant share of companies manage their skills as plain lists. But we have a prominent workaround for this gap: we can use the process model as a proxy for skills.

Systems engineering has a process model that can serve as a proxy for a skills taxonomy, and we have this model in machine-readable form. We built a demonstration that uses the systems engineering digital process model to augment the search for systems engineering skills. The model contains four process groups: technical, technical management, organizational project-enabling, and agreement processes. Each process has activities, and obviously, each activity and process requires relevant skills. It is a hierarchical structure that aligns very well with the systems engineering competency model as a reference catalog of professional skills. As I stated at the beginning, the terms of this model comprise the semantic core of the systems engineering domain — a sound semantic model that can augment the task of searching for skills. Let’s return to the use case I started this talk with.
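The hierarchy can be pictured as nested groups, processes, and activities whose names together form the semantic core. The process and activity names below are a small illustrative subset, not the full digital model:

```python
# Toy slice of the process model hierarchy: group -> process -> activities.
# Names are illustrative, not the complete digital process model.
PROCESS_MODEL = {
    "technical": {
        "stakeholder needs and requirements definition": [
            "identify stakeholders", "define stakeholder needs"],
        "system requirements definition": [
            "define system requirements", "analyze requirements"],
    },
    "technical management": {
        "project planning": ["define the project", "plan technical management"],
        "risk management": ["plan risk management", "analyze risks"],
    },
}

def semantic_core(model):
    """Flatten every group, process, and activity name into one term set."""
    terms = set()
    for group, processes in model.items():
        terms.add(group)
        for process, activities in processes.items():
            terms.add(process)
            terms.update(activities)
    return terms

core = semantic_core(PROCESS_MODEL)
print(len(core))                   # 14 terms in this toy model
print("risk management" in core)   # True
```

Flattening the hierarchy this way gives the controlled vocabulary that search queries and documents can be matched against.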

If we use a systems engineering management plan as an input to a semantic search for systems engineering skills, such a taxonomy could prove useful, increasing the relevance of the skill search. The task of searching for skills is often overlooked. For example, the famous team-building framework introduces five stages in the team life cycle: forming, storming, norming, performing, and adjourning. But before you can start forming any team, you need to discover who should be in it. You need to search for relevant skills, generate proposals to collaborate, and, finally, onboard team members. The project manager, the chief systems engineer, or another technical leader has to bring them together first. And there is more to staffing an organization: one must bring together not only team members but also stakeholders.

This task can prove difficult: once a project proposal is approved, the timeline to arrange everything — find the right people, build connections with stakeholders, and establish proper organizational interfaces — is usually tight. When they give you the money, they want results fast. That’s the way it works. And people are not always at our immediate disposal, so you need to find them quickly. The average project has hundreds of tasks, each requiring several skills, so we are talking about hundreds of skills required to complete the project. A qualified system developer or technical leader has over a hundred skills, with a couple of dozen active and 60 to 80 passive ones.

It’s impossible to know what everyone on your team does and quickly find the best match for all the tasks, issues, risk containment plans, audits, and gate reviews — especially on projects using novel technologies or during high personnel churn. If you look into a typical task tracker, you will find a significant portion of tasks and issues (from my experience, at least 15%) without a responsible person assigned. You will also find many tasks assigned to the same few highly qualified engineers, leaving them overloaded and turning them into bottlenecks, while other staff in the downstream development processes sometimes sit without much to do, especially at the beginning of the project. Finding proper and economically viable skills in an engineering organization is not a trivial task; we tend to lean on a few highly qualified and well-compensated people.

But the problem extends far beyond staffing the project. What if we need to find the right participants for a customer meeting? Or what if we hire a new system developer and need to assemble the interview panel and define the onboarding sequence? How do we find the people who fit best and have enough time? The cost of meetings and new hires arranged by traditional means is staggering — often well over a few thousand dollars. Is the agenda of such a meeting aligned with the competence profiles of the participants? Are they competent to make the decision? One of my favorite examples is responding to a request for proposal or starting a new project. When we get an RFP, the response timeline is very tight — a couple of weeks at best. That means you must find all the people you need to prepare the response within two days.

To quantify the problem of skill search, I propose a small experiment with your LinkedIn profile. LinkedIn has a feature for measuring your Social Selling Index, the SSI, which has four components. If you are close to the average user, the lowest-value components in your profile will be “Find the right people” and “Engage with insights.” That is the hardest part for all of us. My professional network includes more than a thousand people; I communicate with most of them rarely and would appreciate assistance in learning their skills and experience better when I get in touch with them.

That is precisely what we are building at Applied Knowledge Systems: an action-centric, privacy-first personal AI assistant that can fetch the proper context depending on your next actions. You can feed it a project plan from Notion, a product roadmap from GitHub, or a meeting invitation from Gmail, and it will generate messages with invitations to collaborate and recommend the right people from your contact list. Such a solution does not exist right now.

We have also all faced search quality degradation over the last few years, along with a loss of privacy, which concerns us greatly. That’s why we are building Terraphim as an AI assistant that is entirely under your control. We cannot rely on companies that have ignored privacy and let search degrade for years to deliver a proper solution. Big companies usually build mass-market solutions that do not account for industry-specific needs, both because such niche segments do not justify the development costs and because users will never hand over the company-specific data required to customize the solution. All the tailoring big companies can offer comes in the form of prompt engineering, which is more often than not insufficient and too time-consuming for customization.

Pareto-best work-to-skills matching

What can we expect if we successfully solve the problem? According to Gallup reports, many people are miserable at work because they either do not use their abilities to the full extent or are overloaded — two things with the same root cause: a biased work-to-skills matching process. The workplace dissatisfaction numbers are staggering. If we can match work to skills better, we will see workplace satisfaction improve: the most occupied professionals will be less busy, and people who are not using their talents to the full extent will apply more of their skills.

As I mentioned above, a task or a set of tasks usually requires several skills, so in fact we are matching multiple skills at once, and most candidates cannot be equally good at all of them — which is still fine in most cases. For example, I am pretty good at requirements engineering and systems architecture, though my specialty is life cycle and configuration management. For a task where my level of requirements engineering is sufficient, I can be a good fit, even though many people know it much better than I do — because they may not know configuration management, which is also required to get the job done. Such competence profiles are called “Pareto-best.” We optimize the search to match as many skills as possible while giving the key skills more weight than the secondary ones. With traditional search, such an approach is impossible: all keywords in a query have the same importance, or you can mark some keywords as mandatory with query syntax while the engine is free to drop the rest, reducing relevance. You cannot find and properly rank people with requirements engineering, systems architecture, and configuration management skills while showing the strongest configuration managers first and filtering out those who do not know requirements engineering at all — the query syntax does not allow it. You cannot implement a Pareto-best skills search with traditional technology, but you can do it very well in Terraphim.
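The weighting idea can be sketched in a few lines, assuming each candidate’s skills come with a 0-1 proficiency score. The weights, names, and scores below are invented for illustration; this is not Terraphim’s actual ranking function:

```python
# Weighted multi-skill matching: key skills weigh more, mandatory
# skills filter candidates out entirely. All values are illustrative.
REQUIRED = {  # skill -> (weight, mandatory?)
    "configuration management": (3.0, True),   # key skill
    "requirements engineering": (1.0, True),   # must be present at all
    "systems architecture":     (1.0, False),  # nice to have
}

def score(candidate):
    """Weighted sum of proficiencies; None if a mandatory skill is absent."""
    total = 0.0
    for skill, (weight, mandatory) in REQUIRED.items():
        level = candidate["skills"].get(skill, 0.0)
        if mandatory and level == 0.0:
            return None  # fails a hard requirement
        total += weight * level
    return total

candidates = [
    {"name": "A", "skills": {"requirements engineering": 0.9,
                             "systems architecture": 0.9}},      # no config mgmt
    {"name": "B", "skills": {"configuration management": 0.9,
                             "requirements engineering": 0.6}},
    {"name": "C", "skills": {"configuration management": 0.5,
                             "requirements engineering": 0.9,
                             "systems architecture": 0.8}},
]

ranked = sorted((c for c in candidates if score(c) is not None),
                key=score, reverse=True)
print([c["name"] for c in ranked])  # ['B', 'C'] -- A lacks configuration mgmt
```

Candidate B wins on the heavily weighted key skill even though C has broader coverage, and A is filtered out despite strong secondary skills — exactly the ranking a flat keyword query cannot express.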

The textbook quality of the data in engineering projects makes this approach much more attractive to apply in engineering companies. There is a direct relationship between dataset quality and chatbot quality. Check how well ChatGPT answers systems engineering questions and compare that with its performance in marketing or product management. Its answers to systems engineering questions are much more reliable because the training data are so good. The only problem is that these data are distributed across multiple sources, so we need either federated search or robust, reliable, and private interoperability between the data sources.

Another aspect of the proposed solution is improving search relevance in small increments. If you have 15 minutes, you can train Terraphim to understand you better; you don’t need a month-long implementation to build the next increment. You build your own language to talk to your AI assistant, rather than studying proper prompt engineering as commercial LLMs require — they use their own lingo and will not understand your terms; that is just how they are built. Such lengthy implementations and complicated query languages are not aligned with our workplace reality.

You also want to exchange information about skill profiles, and for that you need to map skill descriptions to widely accepted standards. In this demonstration, we use the SFIA framework — the global skills and competency framework for a digital world — and a few WAND taxonomies (project management, for example) to enrich the original systems engineering process model with more specifics and provide interoperability. The SFIA framework works miracles with chatbots. Try Bing Chat or ChatGPT yourself and ask them about your job experience: copy and paste a paragraph from your resume and ask what SFIA skills you have based on the text, and it will provide a very good starting point.

Since all these skill standards use controlled semantics and vocabulary, it is very easy to produce formal activity-centric models from them with different tools. We use the Apollo 4D activity modeler because it implements the best practices introduced and developed by Matthew West, and the resulting models can easily be fetched by chatbots. Of course, you can use machine-readable versions of the Object Management Group standards or WAND taxonomies without spending any time serializing text descriptions of skills into proper JSON. Once you build a formal model for a skill — for example, SFIA Project Management, Level 4, shown in the illustration — you can ingest it into the AI assistant, and it can generate dozens of interview questions on the subject of project management. For example, it can ask a user, “What project management tools did you use to manage risks when you were working at Company X in 2019?” It takes the formal skill model and the user’s resume and generates a proper question, tagging it with domain terms: Project Management (SFIA skill), Project Management Tools (SFIA concept), and Manage Risks (SFIA concept). After it gets the user’s response, it transcribes it and tags it with the same domain terms. When you later search for these keywords, you get a match on confirmed skills that are correctly placed in the SFIA skills taxonomy.
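The tagging step can be sketched as follows. The JSON field names and term lists are invented for illustration — this is not the actual SFIA machine-readable format:

```python
# A toy formal skill model serialized as JSON, used to tag a free-text
# interview answer with domain terms. Schema and terms are illustrative.
import json

skill_model = json.loads("""{
  "skill": "Project Management",
  "level": 4,
  "concepts": ["Project Management Tools", "Manage Risks",
               "Project Plan", "Stakeholder"]
}""")

def tag(text, model):
    """Return the model's domain terms that are mentioned in the text."""
    lowered = text.lower()
    found = [model["skill"]] if model["skill"].lower() in lowered else []
    found += [c for c in model["concepts"] if c.lower() in lowered]
    return found

answer = ("I used project management tools such as MS Project to "
          "manage risks and keep the project plan up to date.")
print(tag(answer, skill_model))
# ['Project Management', 'Project Management Tools', 'Manage Risks', 'Project Plan']
```

Both the generated question and the transcribed answer get the same tags, so a later keyword search lands on confirmed skills rather than raw strings.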

Role-based semantic skills search

So, on one side of such a semantic model you have formal structures — taxonomies and ontologies — and on the other, natural language responses from subject matter experts. You can keep your particular professional lexicon in your response to an AI-generated question, and it will still be searchable and discoverable because it is mapped into the controlled vocabulary of the systems engineering digital process model, the WAND taxonomies, or the SFIA terms. The AI assistant takes natural language input and tags it with skills taxonomies, which gives you the power of formal models for search without requiring much effort to build such models.

But we went even further. Tagging the content significantly improves search relevance, but an average project uses several thousand formal terms from about 15-20 taxonomies, and every piece of content gets tagged with terms from different domains. For example, the same project proposal will be tagged with terms from at least project management, engineering, finance controlling, and corporate compliance. Even in a small project with a team of five, we get a mixed relevancy ranking, simply because the project manager, the developer, and the finance controller have different preferences for how search results should be ranked. Different roles imply different relevancy rankings and different taxonomies to augment the model. With role-based search, the same query will find one person for a project manager and another for an engineer, because they do different jobs — and with an action-centric approach, they should benefit from different recommendations.

That’s why we implemented the role relevancy mechanism in the Terraphim AI assistant; please see the demo of search relevance changing as the user selects different roles. In fact, Terraphim performs a two-step search: first, we establish domain relevance using keywords, and then we establish the relevance of further actions using the role-based mechanism. It may sound vague and complicated, but all these improvements come from your note-taking application — Notion, Fibery, Logseq, or Obsidian. We can also process GitHub repositories and Jira issue trackers. You don’t need to do much beyond what you are most likely already doing as part of your daily routine.

A Terraphim role is a combination of three things: each role has its own set of data sources; each role has its own hashtags, objects, and role-specific activities; and each role has its own preferences in prompts and history of search results. Let’s dive into them. The combination of data sources means that when I am in the father role, I consider the class’s WhatsApp chat and the educational platform relevant sources and the project workspace irrelevant: I don’t want any results from my workspaces, even well-fitting ones. And vice versa: when I search as a project manager, I don’t want results from my personal e-mail or my family photos folder. If I select the systems engineer role, I have high-priority hashtags such as life cycle model, system configuration, or concept of operations, while, for example, the project budget has low relevance in my search. The situation changes when I switch to the project manager role: project budget becomes a highly relevant hashtag, along with work breakdown structure and deliverables, while system requirements matter much less — and that affects the ranking of search results. Finally, the user can approve or ignore search results, fine-tuning the assistant’s relevancy function.
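A role as a bundle of data sources and hashtag weights can be sketched like this, assuming documents are already tagged. The names, sources, and weights are invented for illustration; this is not Terraphim’s actual role configuration:

```python
# A toy role: allowed data sources plus per-hashtag relevance weights.
# All names and numbers are illustrative.
from dataclasses import dataclass

@dataclass
class Role:
    name: str
    sources: set         # data sources this role is allowed to search
    tag_weights: dict    # hashtag -> relevance weight

    def rank(self, docs):
        """Keep docs from this role's sources, sorted by summed tag weight."""
        visible = [d for d in docs if d["source"] in self.sources]
        weight = lambda d: sum(self.tag_weights.get(t, 0.1) for t in d["tags"])
        return sorted(visible, key=weight, reverse=True)

docs = [
    {"id": "budget", "source": "notion", "tags": ["project budget"]},
    {"id": "sysreq", "source": "github", "tags": ["system requirements"]},
    {"id": "photos", "source": "family", "tags": ["vacation"]},
]

engineer = Role("systems engineer", {"notion", "github"},
                {"system requirements": 5.0, "project budget": 0.5})
manager = Role("project manager", {"notion", "github"},
               {"project budget": 5.0, "work breakdown structure": 4.0})

print([d["id"] for d in engineer.rank(docs)])  # ['sysreq', 'budget']
print([d["id"] for d in manager.rank(docs)])   # ['budget', 'sysreq']
```

The family photos never surface for either work role, and the same two documents swap places in the ranking when the role switches — the behavior the paragraph above describes.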

And that’s it. This is the first Terraphim use case we are implementing now: streamlining project staffing with engineers by automating the steps in between, using a controlled dictionary and curated taxonomies ingested from industry standards and the WAND company catalog, with the roles mechanism augmenting search relevance. Thank you very much for your time; please visit our website and participate in testing the early version of the product. Help us build your personal, privacy-first AI assistant that fits your daily workflow.
