I’m pulling conversations from a Slack channel, and I’m running into several problems trying to break the output down into multiple JSON files so I can work on them individually, because it’s too much text to handle at once.
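For reference, this is a minimal sketch of the kind of paginated pull I mean (it assumes the official `slack_sdk` Python package; the token variable and channel ID below are placeholders, not my real values):

```python
import json
import os

from slack_sdk import WebClient  # assumes the official slack_sdk package

# Placeholder values: substitute your own token and channel ID.
client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
CHANNEL_ID = "C0123456789"

messages = []
cursor = None
while True:
    # conversations.history is paginated; follow next_cursor until exhausted.
    resp = client.conversations_history(channel=CHANNEL_ID, cursor=cursor, limit=200)
    messages.extend(resp["messages"])
    cursor = resp.get("response_metadata", {}).get("next_cursor")
    if not cursor:
        break

# Dump the raw pull to a single file first, before any splitting or cleaning.
with open("raw_messages.json", "w", encoding="utf-8") as f:
    json.dump(messages, f, ensure_ascii=False, indent=2)
```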
First, for some reason the Slack API returns duplicated messages in the conversation output, so I need to clean them up afterwards. This makes grouping the data difficult, because I can’t tell whether a given duplicate comes from a mistake in my scenario or from the source itself.
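To rule out my own mistakes, a simple dedup pass keyed on Slack’s message identifiers might look like this — a sketch, assuming the messages are already loaded as a flat list of dicts (`ts` is effectively the per-channel message ID, and `client_msg_id` identifies user-posted messages when present):

```python
def dedupe_messages(messages):
    """Keep the first occurrence of each message, keyed by client_msg_id or ts,
    falling back to the text itself if neither is present."""
    seen = set()
    unique = []
    for msg in messages:
        key = msg.get("client_msg_id") or msg.get("ts") or msg.get("text")
        if key in seen:
            continue
        seen.add(key)
        unique.append(msg)
    return unique
```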
I still haven’t been able to solve this fully. The issue is that when I try to break down the JSON, either (A) important text goes missing (too little text) or (B) a lot of text gets repeated across different chunks (still too much text).
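In principle, plain non-overlapping slicing should avoid both (A) and (B), since every message lands in exactly one chunk file. A rough sketch of that, assuming the messages are already a flat, deduplicated list:

```python
import json

def write_chunks(messages, chunk_size=500, prefix="chunk"):
    """Split the message list into fixed-size, non-overlapping slices and
    write each slice to its own file (chunk_000.json, chunk_001.json, ...)."""
    for i, start in enumerate(range(0, len(messages), chunk_size)):
        chunk = messages[start:start + chunk_size]
        with open(f"{prefix}_{i:03d}.json", "w", encoding="utf-8") as f:
            json.dump(chunk, f, ensure_ascii=False, indent=2)
```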
I’ll keep trying before I post the solution I came up with, which is still not ideal: essentially, I kept the repeated text in each chunk, sent each chunk to a Google Sheet row, and then ran a custom Google Sheets script to find and clean the duplicated texts.
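The idea of that cleanup step, sketched in Python rather than Apps Script (and assuming each sheet row holds one chunk’s text with one message per line), is just to keep the first occurrence of each line and drop the rest:

```python
def clean_duplicate_lines(rows):
    """rows: list of chunk strings, one per sheet row.
    Returns the rows with any line that already appeared in an earlier
    row (or earlier in the same row) removed."""
    seen = set()
    cleaned = []
    for row in rows:
        kept = []
        for line in row.splitlines():
            if line.strip() and line in seen:
                continue
            seen.add(line)
            kept.append(line)
        cleaned.append("\n".join(kept))
    return cleaned
```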