
Adding a Serverless Commenting System with AWS

Mon 01/14/19


My goal for this post is to explain how I implemented a serverless commenting system with AWS. The first step was to figure out the requirements.

I started with the easy part first: a form.

<h2>Comments</h2>
<form>
  <input id="name" placeholder="your name" maxlength="16" required>
  <textarea id="comment-body" placeholder="your comment" required></textarea>
  <button id="submit-button" type="button" onclick="saveComment()">Submit</button>
</form>
<div id="comments">
  <!-- Individual comments will be inserted here -->
</div>

Next I thought about how to give each post a unique identifier. I contemplated a few complex solutions, but eventually decided to keep it simple: include a script tag in every post to declare a constant.

<script type="text/javascript">
const postNum = 6;
</script>

Ideally I would insert this automatically when each post is compiled, by sorting the posts by date and counting up.
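
A build step along these lines could generate the constant automatically. This is only a sketch: the ./posts directory and the use of file modification times as post dates are assumptions.

const fs = require('fs');

// Number posts by date: the oldest post becomes number 1.
const posts = fs.readdirSync('./posts')
  .map(file => ({ file, date: fs.statSync(`./posts/${file}`).mtime }))
  .sort((a, b) => a.date - b.date);

posts.forEach(({ file }, index) => {
  // Append the constant to each compiled post.
  fs.appendFileSync(`./posts/${file}`, `<script>const postNum = ${index + 1};</script>\n`);
});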

With the markup done, I moved on to creating an API for getting comments from and posting them to a database.

Creating a Table in DynamoDB

DynamoDB is a NoSQL database that can scale to any level. You can either pay for the amount of reads/writes that you want to allow per second, or you can let it auto scale, up to a claimed 20 million requests per second. I went with the former, which is called provisioned capacity.

dynamo-scaling.png

I set my table to handle 2 requests per second for now. This can be edited at any time, which allows you to start small and increase capacity as you see fit.

One of the things that I struggled with when setting up my comments table is the "NoSQL" bit. I didn't quite grasp that it means exactly what it says: DynamoDB is not SQL. A SQL table has a set number of columns and a robust language for making fast queries on any column.

In contrast, a DynamoDB table has a primary key that uniquely identifies each JSON object stored in the table. The benefit is that data of any structure can be stored. The downside is that efficient SQL-like queries on non-primary-key attributes are impossible.

If I were doing SQL, I would make a table like this:

id  post_number  comment_name  comment_body              time_stamp
1   6            nicolas       cool post                 1/14/19
2   6            nicolas       sql is better than nosql  1/14/19

And query for it like this:

SELECT time_stamp,
       comment_name,
       comment_body
FROM comments
WHERE post_number = 6
ORDER BY time_stamp;

The problem with making this table in DynamoDB is that comments must be pulled by their post number and then ordered by their timestamp. I wouldn't be able to query by post number, because id is the unique primary key.

If I wanted to use this schema, I would have to use the following algorithm:

  1. Pull every comment in the table
  2. Filter for the comments that belong to the post
  3. Sort those comments by date
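
In code, that inefficient approach would look something like this. A sketch only, using the low-level JavaScript SDK client; note that the scan reads the entire table no matter how many comments match:

const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB();

dynamo.scan({ TableName: 'comments' }).promise().then(result => {
  // Filter and sort client side, since the table can't do it for us.
  const comments = result.Items
    .filter(item => item.post_number.N === '6')
    .sort((a, b) => Number(a.time_stamp.N) - Number(b.time_stamp.N));
});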

I needed a primary key that would always be unique, which I could also use to get all the comments for a specific post.

I pulled up the DynamoDB docs and learned about composite primary keys. A composite primary key is a combination of a partition key and a sort key. Multiple items may share the same partition key, but their sort keys must be unique. A list of rows can be pulled by their partition key, and they will come out ordered by their sort key.

I dropped the id key and replaced it with a composite key: post_number as the partition key and time_stamp as the sort key.

comments-table.png

Note that the comment_body and comment_name columns are not specified anywhere in the schema. I can submit data with any key to this table and it will create a new column for it. The only requirements are that post_number exists and that time_stamp is unique within it.
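
For reference, creating a table with this key schema from the JavaScript SDK looks roughly like the following. A sketch: I set mine up through the console, and the capacity units here are placeholders.

const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB();

dynamo.createTable({
  TableName: 'comments',
  // Only key attributes are declared; everything else is schemaless.
  AttributeDefinitions: [
    { AttributeName: 'post_number', AttributeType: 'N' },
    { AttributeName: 'time_stamp', AttributeType: 'N' }
  ],
  KeySchema: [
    { AttributeName: 'post_number', KeyType: 'HASH' },  // partition key
    { AttributeName: 'time_stamp', KeyType: 'RANGE' }   // sort key
  ],
  ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
}).promise();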

Making API calls to Lambda for DynamoDB operations

With the comments table set up, the next task was to create some back end functions to perform read/write operations. I chose to use Node.js with Lambda / API Gateway to accomplish this.

lambda-triggers.png

Writing the Node.js to communicate with DynamoDB was straightforward, and before long I had two routes, callable from cURL, that saved and retrieved comments:

Get comments handler
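
The get handler is shaped roughly like this. A sketch rather than the exact code; it uses the low-level client so items keep their typed { S: ... } / { N: ... } attribute format, which the front end script below relies on:

const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB();

exports.handler = async (event) => {
  // Pull every comment for one post, ordered by the time_stamp sort key.
  const result = await dynamo.query({
    TableName: 'comments',
    KeyConditionExpression: 'post_number = :p',
    ExpressionAttributeValues: {
      ':p': { N: event.queryStringParameters.postNumber }
    }
  }).promise();
  return {
    statusCode: 200,
    headers: { 'Access-Control-Allow-Origin': '*' },
    body: JSON.stringify(result.Items)
  };
};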

Post comments handler
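
And the post handler, again as a sketch; the server assigns the timestamp so the sort key can't be spoofed:

const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB();

exports.handler = async (event) => {
  const { postNumber, commentName, commentBody } = JSON.parse(event.body);
  const item = {
    post_number:  { N: String(postNumber) },
    time_stamp:   { N: String(Date.now()) }, // server-side timestamp as the sort key
    comment_name: { S: commentName },
    comment_body: { S: commentBody }
  };
  await dynamo.putItem({ TableName: 'comments', Item: item }).promise();
  // Echo the item back so the front end can display it immediately.
  return {
    statusCode: 200,
    headers: { 'Access-Control-Allow-Origin': '*' },
    body: JSON.stringify(item)
  };
};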

Next I added some JavaScript to handle the submit button:

function saveComment() {
  const name = document.getElementById('name').value;
  const comment = document.getElementById('comment-body').value;
  fetch('https://l4oejeyzok.execute-api.us-west-2.amazonaws.com/default/post_comment', {
    method: 'POST',
    body: JSON.stringify({
      postNumber: postNum,
      commentName: name,
      commentBody: comment
    }),
    headers: {
      'Content-Type': 'application/json'
    }
  }).then(response => response.json())
    .then(data => displayComment(data))
    .catch(err => console.log(err));
}
function displayComment(comment) {
  const comments = document.getElementById('comments');
  const date = new Date(parseInt(comment.time_stamp.N));
  const year = date.getFullYear();
  const month = date.getMonth() + 1;
  const day = date.getDate();

  const newComment = document.createElement('div');
  newComment.classList.add('comment');
  newComment.innerHTML = `
    <div class="comment-name">
      <strong>${comment.comment_name.S}<span class="date">${month}/${day}/${year}</span></strong>
    </div>
    <div class="comment-body">
      ${comment.comment_body.S}
    </div>`;
  comments.prepend(newComment);
}

I hit the submit button and a yellow message popped up in my console: "Cross-Origin Request Blocked". I discovered that my browser was protecting me from fetching resources from a different server than the page was hosted on.

On a classic web server this isn't a problem, because the back end files are served from the same domain as the HTML. At this point I hit the biggest roadblock of the project: figuring out how to get around this.

I learned that cross origin requests can be allowed through CORS, or Cross-Origin Resource Sharing, which is enabled on the server that the resources are requested from. At first I thought all I had to do was add a header to the Lambda response: "Access-Control-Allow-Origin": "*"

This means "allow any website to request this resource". I could change the * to nicolasknoebber.com, but I test this often from localhost, so I chose to leave it as the wildcard.

I went back to the AWS docs, and eventually found this article. In addition to the Access-Control-Allow-Origin header, I would need to create another method in API Gateway: a so-called "preflight" check. Luckily, API Gateway automates this process.

api-gateway-cors.png

So when a script in one of my blog posts makes an API call to AWS, it first sends an OPTIONS request, and API Gateway responds to say: OK, this CORS request can go through. After receiving that reply, the browser sends the actual POST request that saves the comment.
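
The exchange looks roughly like this (an illustrative trace, not captured output):

OPTIONS /default/post_comment HTTP/1.1
Host: l4oejeyzok.execute-api.us-west-2.amazonaws.com
Origin: https://nicolasknoebber.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: content-type

HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: OPTIONS,POST
Access-Control-Allow-Headers: Content-Type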

Finishing up

The rest of the project came together quickly once I was able to contact API Gateway from local JavaScript. I added another fetch to get all the comments, with the postNum constant as its parameter.
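
That fetch is shaped something like this; the get_comments route name is an assumption, mirroring the post route above:

fetch(`https://l4oejeyzok.execute-api.us-west-2.amazonaws.com/default/get_comments?postNumber=${postNum}`)
  .then(response => response.json())
  .then(comments => comments.forEach(displayComment))
  .catch(err => console.log(err));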

I added a few basic anti-spam measures, sketched in code after the list below. While I could have used a captcha, I would rather say no to having my readers train Google's AI.

  • Prevent the same comment from being submitted twice by using a JavaScript Set
  • Disable the submit button while a create request is still asynchronously processing
  • Sanitize comments so they can't be saved with HTML tags
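
A rough sketch of those measures; postComment here is a hypothetical stand-in for the fetch call shown earlier:

const submitted = new Set(); // comments already sent this session

function saveComment() {
  const button = document.getElementById('submit-button');
  const name = document.getElementById('name').value;
  // Strip anything that looks like an HTML tag before saving.
  const comment = document.getElementById('comment-body').value.replace(/<[^>]*>/g, '');

  if (submitted.has(comment)) return; // same comment twice: ignore it
  submitted.add(comment);

  button.disabled = true; // no double submits while the request is in flight
  postComment(name, comment).finally(() => { button.disabled = false; });
}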

Here's the final script for client-side comment handling: comments.js.

More importantly, I have auto scaling turned off in my AWS services, so my bill won't get large if a spammer does target me. For my entire AWS stack (S3 + DynamoDB + Lambda + API Gateway + Route53) I still pay only $1 a month.
