Hi Make.com Community,
I’m working on a project where I need to parse a JSON payload from Deepgram and turn the transcript string inside it into structured data. Here’s an example of the JSON payload I receive (the string I am targeting is the value of the “transcript” key):
[
{
"statusCode": 200,
"data": {
"results": {
"channels": [
{
"alternatives": [
{
"confidence": 0.99902344,
"paragraphs": {
"transcript": "\nSpeaker 0: He got the business of making money. But Oliver and I, we are like, the business makes money. Like, we took home over 300, 000 in profit last year, so we are making money, but I can't we can't get them to see that if we're not all on the same page. So so she she had given away almost $30 worth of product. So I had sent that to Marie and said, hey.\n\nSpeaker 1: Like, here's what she gave away. Can we put it in the tax\n\nSpeaker 0: write off? Can we write it off? Like, obviously, companies are allotted a certain amount of money for gifts, and you can make that work. But then I give it to her, and it doesn't really go anywhere. And so then it's just kinda like, that it that it has to matter in the final number.\n\nLike, I come from a retail world. I come from buying and marketing and merchandising before I became an assistant, and so it just doesn't really ring with me because it's I know it's not being done right. And I think where it\n\nSpeaker 1: would maybe help is if once we speak with you, maybe Jamie will hear it more from you. Like, look. I think this and this. Because I had brought it to\n\nSpeaker 0: her, and she said, well, that's just an accounting thing. Like, we can fix that. But it's not really if we're not all on the same page. Anybody's thoughts?\n\nSpeaker 1: That's what I've heard. I'm new.\n\nSpeaker 0: I'm new. I'm I'm just learning. So So, Oliver, what do you think on that?\n\nSpeaker 2: Okay. So I I guess my my thought process my thought is this. It's not really clear to me where the business is gonna go and where the money's going and where the money's coming from and stuff like that. I know that you and I met you and I, Jenny, talked about, and we've successfully transitioned things into a situation whereby the company is able to pay for its own stuff. Okay.\n\nRight? 
And, I mean, just so you're aware, the easiest way for us to do that because from the get go you know, Again, I'm not sure how familiar you are with Shopify, but the way that it works is someone a customer pays for their item. And if they pay via credit card or, like, a traditional, you know, format, it then gets put into Shopify's, like, you know, financial I don't know. You're basically, like, your kitty, and then they pay it out to the account that you provide them with. If they pay via PayPal and you have PayPal connected, it goes into a PayPal account.\n\nAnd then and then the, business owner can then transfer the money from the PayPal account into their bank account account if they wish, or they can just use the PayPal account to pay for things. So that PayPal account as of the start of last year, as in, yeah, the start of 2023, had something like what was it? Like, almost $200, 000 in it.\n\nSpeaker 0: Right? $236, 900\n\nSpeaker 2: in something. Just accumulated over time. And so I said so we had a conversation with Maria, the accountant. And at this point, remember, we had kind of Jamie was like, COVID's over. I've got a ton of stuff to do.\n\nI'm happy to post I'm happy to post online and promote the business. But Mhmm. You know, I kind of I need to separate myself a bit more from it. So we had a conversation with Maria, which was, listen. Let's use the PayPal account.\n\nAccount. Let's not worry about it because we had never been given access to the direct bank accounts. I'm not even sure if the bank account itself is separate from Jamie's name. Like, originally, the bank account that was used, the City National Bank account that was used to collect the money from PayPal was actually Jamie's personal account.\n\nSpeaker 0: Because they were underwriting everything.\n\nSpeaker 2: Correct. But also because there was no other account.\n\nSpeaker 1: Like, it was just that\n\nSpeaker 2: we needed an account. 
There might be I believe there is 1 now, but we still don't have access to that, which we which we don't really need because there's enough payments coming through PayPal just to sustain the business costs. So Mhmm. What I said what we said to Maria was, why don't you know, whenever it reaches a certain amount, like, let's say, a 100 k or 75, whatever, basically, let's try to leave $50, 000 in the account. And and then that way, Jenny can use that as operating costs to buy\n\nSpeaker 1: a new product or if something goes wrong or\n\nSpeaker 2: she needs to hire somebody or whatever. And so that's kind of how it's been operating. I we I then applied. Like, it's a business PayPal account, so I applied for the business to have credit cards, which they have, and set up payments through PayPal directly for a lot of the other services, like the apps and stuff that they were using on Shopify and the Shopify fees, which means that we can kind of track it better rather than just basically taking Shopify's word for it. We can actually see what has been charged.\n\nSo all of that is great. Then Jenny and I were chatting about, okay. Like, it would be good for Maria to have access to this. Maria's the accountant. So that she can at least see because she doesn't have access."
}
}
]
}
]
}
}
}
]
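To make the target concrete outside Make.com, here is a minimal Python sketch of pulling that string out of the payload. The helper name `extract_transcript` and the shortened sample payload are assumptions for illustration only; the key path follows the example above.

```python
import json

def extract_transcript(payload_text):
    """Return the transcript string from a Deepgram-style payload (hypothetical helper)."""
    payload = json.loads(payload_text)
    # The payload is a one-element array; the transcript sits under the first
    # channel's first alternative, as in the example above.
    alternative = payload[0]["data"]["results"]["channels"][0]["alternatives"][0]
    return alternative["paragraphs"]["transcript"]

# Shortened, made-up payload with the same nesting as the real one.
example = json.dumps([{"statusCode": 200, "data": {"results": {"channels": [
    {"alternatives": [{"confidence": 0.99, "paragraphs": {
        "transcript": "\nSpeaker 0: Hello.\n\nSpeaker 1: Hi."}}]}]}}}])
transcript = extract_transcript(example)
```

In Make.com itself the same value would be mapped from the module output rather than parsed by hand; the sketch only shows which nested key is meant.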
My goal is to parse this string to create two variations of structured data from it:
- Structured Data by Paragraph: I want to break down the entire transcript into individual paragraphs. Each paragraph should be a distinct element in the structured data, making it easy to reference and analyze specific sections of the conversation. For example:
[
{
"paragraph": "He got the business of making money. But Oliver and I, we are like, the business makes money..."
},
{
"paragraph": "Like, here's what she gave away. Can we put it in the tax..."
},
...
]
- Structured Data by Speaker: I need to extract and group all spoken content by speaker into separate objects or arrays. This will let me see everything a specific speaker said in one place, which makes it easier to analyze each speaker’s comments and interactions. For example:
{
"Speaker 0": [
"He got the business of making money. But Oliver and I, we are like, the business makes money...",
"write off? Can we write it off? Like, obviously, companies are allotted a certain amount of money for gifts...",
"Right? $236, 900...",
"Because they were underwriting everything...",
...
],
"Speaker 1": [
"Like, here's what she gave away. Can we put it in the tax...",
"would maybe help is if once we speak with you, maybe Jamie will hear it more from you...",
"Like, it was just that...",
...
],
...
}
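Both target structures can be sketched in plain Python as a reference for what the scenario should produce. This is only an illustration under two assumptions that are not confirmed by Deepgram's output format: paragraphs are separated by blank lines, and a paragraph without a "Speaker N:" label belongs to the most recent speaker. The function name `parse_transcript` and the sample text are made up.

```python
import re
from collections import defaultdict

# Matches a "Speaker N:" label at the start of a paragraph.
SPEAKER_LABEL = re.compile(r"^(Speaker \d+):\s*")

def parse_transcript(transcript):
    """Return (paragraphs, by_speaker) for a transcript string (hypothetical helper)."""
    paragraphs = []
    by_speaker = defaultdict(list)
    current_speaker = None
    # Paragraphs are assumed to be separated by blank lines ("\n\n" in the payload).
    for block in re.split(r"\n\s*\n", transcript.strip()):
        text = block.strip()
        if not text:
            continue
        match = SPEAKER_LABEL.match(text)
        if match:
            # A new label starts a new turn; drop the label from the text.
            current_speaker = match.group(1)
            text = text[match.end():]
        paragraphs.append({"paragraph": text})
        if current_speaker is not None:
            # Unlabeled paragraphs are attributed to the last seen speaker.
            by_speaker[current_speaker].append(text)
    return paragraphs, dict(by_speaker)

sample = ("\nSpeaker 0: Hello there.\n\nSpeaker 1: Hi.\n\n"
          "Speaker 0: First part.\n\nSecond part, same speaker.")
paragraphs, by_speaker = parse_transcript(sample)
```

With this sample, "Second part, same speaker." ends up under "Speaker 0" even though it carries no label, which is the grouping behavior I am after.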
I’ve been trying to use text parser modules to identify and separate speakers and paragraphs, but I am running into issues. The patterns I’ve used so far haven’t given the desired result.
Here’s mainly what I’ve tried so far:
- Detecting paragraphs using the pattern \n\n.
- Detecting and grouping speakers and their content using the pattern (Speaker \d+:.*?)(?=\nSpeaker \d+:|$).
- Adjusting module settings:
• Global match: Yes
• Case sensitive: No
• Multiline: Yes
• Single line: No
- Refining regex patterns to accurately split text into paragraphs and identify speakers.
- Testing different JSON payloads to ensure consistent pattern matching.
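The behavior of the speaker pattern under these settings can be reproduced in Python. This is a sketch, assuming Python's `re.MULTILINE`, `re.DOTALL`, and `re.IGNORECASE` flags behave like the module's Multiline, Single line, and case-insensitive settings; Make.com's regex engine may differ in details, and the sample text is made up.

```python
import re

text = "Speaker 0: First paragraph.\n\nSecond paragraph.\n\nSpeaker 1: Reply."
pattern = r"(Speaker \d+:.*?)(?=\nSpeaker \d+:|$)"

# Settings as listed above: Multiline on, Single line off.
# "." stops at a newline and "$" matches at the end of every line,
# so each match ends at the end of the speaker's first line.
as_tried = re.findall(pattern, text, flags=re.MULTILINE | re.IGNORECASE)

# Swapped settings: Single line on ("." also matches newlines), Multiline off
# ("$" matches only at the end of the text), so a match runs to the next label.
swapped = re.findall(pattern, text, flags=re.DOTALL | re.IGNORECASE)
```

In this sketch `as_tried` drops "Second paragraph." entirely, while `swapped` keeps it inside the Speaker 0 match (with a trailing newline that would need trimming). That may be related to the grouping failures described here, though I have not confirmed the module behaves the same way.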
However, these patterns are not producing the correct structured output. The text parser modules either repeat operations or fail to group the data as expected.
Could someone provide insight or suggestions on how to effectively parse this transcript string to achieve structured data? Any advice or examples from similar use cases would be greatly appreciated!
Thank you in advance for your help.