Java Developer · 3-7 years
2
Hey Deepak, thanks for jumping on! Quick heads-up — I'm an AI interviewer, so this'll feel like a normal conversation but with me on the other end. How's your day going so far? Everything loading okay on your side, can you hear me alright?
Yeah, everything is loading. I can hear you clearly. Thank you.
Great, glad that's working. So let's just dive in — six years of Java experience is a solid chunk of time. I'm curious, what's the most complex Java system you've worked on recently? Just give me the broad strokes — what it did, what made it interesting.
So recently we have been working on an application that handles the hiring process. We post some jobs, we get the resumes, we select the suitable candidates, and then we conduct the AI interviews. After that we let the AI analyze the interview and give us a result for it, and then we compare the results and complete the hiring process.
Oh that's kind of meta — you're being interviewed by an AI system that sounds similar to what you built. That's kind of funny actually. So walk me through the backend a bit — what does the Java side look like? Like what's handling the core logic there, is it Spring Boot, something else?
Yes, we are using Spring Boot as the framework.
Yeah, Spring Boot's the go-to for that kind of thing. So with Spring Boot handling the backend — what does the service layer look like? Like when a resume comes in, how does that flow through the system? What are the key components you personally built or owned?
So whenever anyone uploads a resume to our system, we parse that resume using an LLM and extract the key points. We have defined a JSON object for the response we want from the LLM, and then we persist it in our database.
Right, so you're basically structuring unstructured data with an LLM and then persisting that.
Yes.
That's a pretty common pattern now, but there's a lot of interesting stuff that can go wrong there. What happens when the LLM returns something that doesn't match your expected JSON structure? How did you handle that on the Java side?
No, we have specifically trained the LLM there with our prompts, and we need the response in that structured way only. Any key points that the LLM is unable to extract from the resume are sent to us as null, and then we can check in our system whether that is a pivotal point or not. But we have strictly guided our LLM to give us the response only in the structure we want.
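The null check the candidate describes could look roughly like the sketch below. It is only an illustration: the class name, the field names, and the choice of which fields count as "pivotal" are assumptions, not details from the interview.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: names and required fields are illustrative assumptions.
final class ParsedResumeCheck {
    // Fields assumed to be pivotal: without them the candidate cannot be contacted.
    static final List<String> REQUIRED_FIELDS = List.of("name", "email");

    // Returns the required fields that the LLM left null, blank, or omitted entirely.
    static List<String> missingRequired(Map<String, String> parsed) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED_FIELDS) {
            String value = parsed.get(field);
            if (value == null || value.isBlank()) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

A resume whose result has a non-empty list here could be marked as an error for review, while nulls in optional fields are simply stored as they are.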
Got it, so you're relying on the prompt engineering to enforce the structure. But I'm curious — did you ever hit a case where despite all that, the LLM just returned something malformed or unexpected? Like in production, did that ever actually happen?
No, so far when I have tested it, we upload resumes in bulk, like 20 to 50 resumes, and I have not yet encountered a case where the LLM does not return the response in a structured way. Though there are some cases, like what we have done to reduce the cost per prompt: we first use a PDF library to extract the information and then send it as text instead of as an attachment. But some resumes that come to us are PDFs that contain only an image, and the PDF library cannot decipher those. So those resumes we send as they are to the LLM and we get the response.
Oh yeah, that's a classic edge case — scanned PDFs are always a pain. So when the PDF library fails to extract text, how do you detect that? Like how does your code know "hey, this one's an image-based PDF, I need to fall back"?
Yeah, there we have exception handling, because the library we are using itself lets us know that an image-based PDF is there. Then we fall back to the cloud mechanism, the LLM mechanism, where we send the complete PDF as it is to the LLM and get the response.
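The fallback described here might be sketched as follows. All types and method names are hypothetical and stand in for the real PDF library and LLM client. Since a text extractor may return blank text for a scanned PDF instead of throwing, this sketch treats both an exception and blank output as the signal to fall back.

```java
// All names here are hypothetical; they are not PDFBox APIs or the candidate's code.
interface TextExtractor {
    // Returns the text layer of the PDF; may throw or return blank text for scanned files.
    String extractText(byte[] pdf) throws Exception;
}

interface ResumeLlm {
    String parseFromText(String text); // cheaper path: text-only prompt
    String parseFromPdf(byte[] pdf);   // fallback path: send the whole file
}

final class ResumeRouter {
    private final TextExtractor extractor;
    private final ResumeLlm llm;

    ResumeRouter(TextExtractor extractor, ResumeLlm llm) {
        this.extractor = extractor;
        this.llm = llm;
    }

    // Tries local text extraction first and falls back to sending the raw PDF.
    String parse(byte[] pdf) {
        String text;
        try {
            text = extractor.extractText(pdf);
        } catch (Exception e) {
            text = null; // extraction failed: treat it like a PDF with no text layer
        }
        if (text == null || text.isBlank()) {
            return llm.parseFromPdf(pdf);
        }
        return llm.parseFromText(text);
    }
}
```

Keeping the extractor and the LLM client behind small interfaces makes the routing decision testable without a real PDF or a network call.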
Right, so the library throws an exception and you catch it and route it differently. That makes sense. What library are you using for
Yes.
We are using PDFBox.
Oh yeah, Apache PDFBox, solid choice. Okay so let's zoom out a bit from the resume parsing piece. You've got Spring Boot, you've got LLM calls, you're persisting to a database. What database are you using, and how did you model the candidate data?
We are using Postgres, PostgreSQL. We are using a structured database and we have multiple tables for it. First we queue all the resumes, and then in a table we save the status of each resume, whether it has been parsed or has encountered an error, so that we can display it on the front end and retry uploading that resume. At last we save the structured response that we get from the LLM in the candidate parse item table, where we store mainly the basic and key points from the resume that we need to find the suitability of a candidate for the job, plus some basic information that we need to send the invitation, to call, and to communicate with the candidate. So we store only this information.
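The status tracking and retry behaviour described above could be modelled along these lines. The state names, the transition rules, and the attempt counter are assumptions made for illustration. The sketch is an in-memory model only and says nothing about how the rows are actually stored or queued.

```java
// Hypothetical status model; states and transition rules are assumptions.
enum ResumeStatus { QUEUED, PARSING, PARSED, FAILED }

final class ResumeJob {
    private ResumeStatus status = ResumeStatus.QUEUED;
    private int attempts = 0;

    ResumeStatus status() { return status; }
    int attempts() { return attempts; }

    // A worker picks the resume up from the queue.
    void startParsing() {
        require(ResumeStatus.QUEUED);
        status = ResumeStatus.PARSING;
        attempts++;
    }

    // The LLM response was stored successfully.
    void markParsed() {
        require(ResumeStatus.PARSING);
        status = ResumeStatus.PARSED;
    }

    // Parsing hit an error; the front end can show this and offer a retry.
    void markFailed() {
        require(ResumeStatus.PARSING);
        status = ResumeStatus.FAILED;
    }

    // Retry puts a failed resume back in the queue.
    void retry() {
        require(ResumeStatus.FAILED);
        status = ResumeStatus.QUEUED;
    }

    private void require(ResumeStatus expected) {
        if (status != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + status);
        }
    }
}
```

Rejecting invalid transitions, such as retrying a resume that has already been parsed, keeps the status shown on the front end consistent with what actually happened.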
Right, so you've got like a queue table, a status tracking table, and then the parsed candidate table. That's a clean separation. One thing I'm curious about — you mentioned queuing the resumes. Are you doing that queue in Postgres itself, or is there a message queue like Kafka or RabbitMQ involved?